OK, here's the thing: my boss has a number of HTML documents, each containing lots of short text passages, that he would like to use in a web directory of sorts. The problem is that some of the HTML documents contain over 300 text passages, and entering them all into a new database manually would be very annoying. So I am trying to figure out how to do it automatically.
The basic HTML code used in the documents is the following:
<STRONG>this is a title</STRONG><BR>
<H1>This is another title</H1>
Text is in here
<HR>
So every text passage starts with a <strong> tag and ends with an <hr> tag. I was thinking of loading the contents of the file into a variable (fopen() followed by fread(), since fopen() on its own only opens the file), then using explode() to split it at the <hr> tag. Since the text passages will then be stored in an array, it should be no problem to take each of the small text passages and enter it into its own row in a database.
Will this work?
If they were in XML format, he'd still either have to rewrite the documents to fit RSS standards, or modify the reader so that it would accept the document format he's using.
It's possible to split the documents into more usable chunks with the explode() function, but then you would have to store the chunks somewhere in order to make them usable.
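To show the splitting step concretely, here is a rough sketch. It is written in Python rather than PHP, so re.split() stands in for explode(), and the regex is my own choice so that <hr>, <HR> and <hr /> all count as separators. The function names and the sample text are made up for illustration; this is not a drop-in solution for the real files.

```python
import re

def split_passages(html):
    """Split raw HTML into passages at every <hr> tag (any case, optional attributes)."""
    chunks = re.split(r"<hr\b[^>]*>", html, flags=re.IGNORECASE)
    # Drop the empty chunk left after the final <hr> and trim whitespace.
    return [chunk.strip() for chunk in chunks if chunk.strip()]

def extract_title(passage):
    """Return the text of the first <strong>...</strong> in a passage, or None."""
    match = re.search(r"<strong>(.*?)</strong>", passage,
                      flags=re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None

# Hypothetical sample in the same shape as the documents described above.
sample = """<STRONG>First title</STRONG><BR>
<H1>Another title</H1>
Text is in here
<HR>
<STRONG>Second title</STRONG><BR>
More text
<hr>
"""

passages = split_passages(sample)
titles = [extract_title(p) for p in passages]
```

In PHP, a plain explode('<hr>', $html) would miss the uppercase <HR> used in the sample markup, so a case-insensitive split is probably worth the extra line there too.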
The easiest solution I can think of would be to put each one into an SQL database, which can be updated easily later on.
I'd recommend something like this as the table structure (you might need to split each article's title out):
ID (no dots in the actual column name; I only wrote it as "I.D." here)
That way, you could look up documents by their titles or by what's in the document.
Having a category field is pretty obvious if you're going to put them in any kind of directory, since you would want to grab a group of results relevant to the subject you're looking for without hit-and-miss searches.
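For what it's worth, here is a small sketch of how such a table could be set up and queried. It uses Python's built-in sqlite3 module with an in-memory database purely so it runs anywhere; the table name, the column names (id, title, category, body) and the sample rows are my assumptions based on the fields discussed above, not a fixed recommendation.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for any SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE passages (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        title    TEXT NOT NULL,
        category TEXT NOT NULL,
        body     TEXT NOT NULL
    )
""")

# Hypothetical rows; in practice these would come from the split HTML passages.
rows = [
    ("First title", "hardware", "Text about printers"),
    ("Second title", "software", "Text about editors"),
    ("Third title", "hardware", "Text about scanners"),
]
conn.executemany(
    "INSERT INTO passages (title, category, body) VALUES (?, ?, ?)", rows
)
conn.commit()

def by_category(category):
    """Return the titles of all passages filed under one category."""
    cur = conn.execute(
        "SELECT title FROM passages WHERE category = ? ORDER BY id", (category,)
    )
    return [row[0] for row in cur]

def search(term):
    """Return the titles of passages whose title or body contains the term."""
    like = f"%{term}%"
    cur = conn.execute(
        "SELECT title FROM passages WHERE title LIKE ? OR body LIKE ? ORDER BY id",
        (like, like),
    )
    return [row[0] for row in cur]
```

The parameterized queries (the ? placeholders) matter once the text comes from real documents, because the passages will almost certainly contain quotes that would break a hand-built SQL string.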