I'm not even sure how to describe my problem

Jayson (Beginner, Link Clerk)
OK, here's the thing: my boss has a number of HTML documents, each containing lots of short text passages, that he would like to use in a web directory of sorts. The problem is that some of the HTML documents contain over 300 text passages, and entering them all into a new database by hand would be very tedious. So I am trying to figure out how to do it automatically.

The basic HTML used in the documents looks like this:

<STRONG>this is a title</STRONG><BR>
<H1>This is another title</H1>
Text is in here
<HR>

So every text passage starts with a <strong> tag and ends with an <hr> tag. My idea: open the file with fopen(), read its contents into a variable, then use explode() to split it at the <hr> tag. Since the passages will then be stored in an array, it should be no problem to take each one and enter its parts into individual fields in a database.

Will this work?

Comments

  • Beachie (Moderator: Raging Bender, Link Clerk)
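    That should work, but you'll probably need more than one split: one at the <hr> tag to separate the passages, then further splits inside each passage to pull out the titles and the text. You could also check out some of the PHP RSS readers, which do a similar thing with XML instead of HTML.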
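To illustrate the "few splits" idea, here is a minimal sketch. It is written in Python rather than PHP so it can be run standalone; in PHP the same steps would map onto explode() or preg_split() for the <hr> split and preg_match() for the titles. Note that explode() matches its delimiter exactly, so <HR> and <hr> would not both be caught by one call, which is why the sketch splits with a case-insensitive pattern. The names SAMPLE, PASSAGE_RE and split_passages are made up for this example, and it assumes every passage follows the <STRONG>, <H1>, text, <HR> layout shown in the question.

```python
import re

# Hypothetical sample data in the layout described in the question.
SAMPLE = """<STRONG>First title</STRONG><BR>
<H1>First subtitle</H1>
First passage text
<HR>
<STRONG>Second title</STRONG><BR>
<H1>Second subtitle</H1>
Second passage text
<HR>
"""

# Second-level "split": pull title, subtitle and body out of one passage.
# Assumes <strong> comes first, optionally followed by <br>, then <h1>.
PASSAGE_RE = re.compile(
    r"<strong>(?P<title>.*?)</strong>\s*(?:<br\b[^>]*>)?\s*"
    r"<h1>(?P<subtitle>.*?)</h1>(?P<content>.*)",
    re.IGNORECASE | re.DOTALL,
)


def split_passages(html):
    """Split the document at <hr> tags and parse each chunk into a dict."""
    passages = []
    # First-level split: one chunk per passage, whatever the tag's case.
    for chunk in re.split(r"<hr\b[^>]*>", html, flags=re.IGNORECASE):
        match = PASSAGE_RE.search(chunk)
        if match is None:
            # Skip empty trailing chunks or anything not in the expected layout.
            continue
        passages.append(
            {name: value.strip() for name, value in match.groupdict().items()}
        )
    return passages


passages = split_passages(SAMPLE)
for passage in passages:
    print(passage["title"], "|", passage["subtitle"], "|", passage["content"])
```

Chunks that do not match the expected layout are silently skipped here; for real documents it would probably be worth logging them so nothing gets lost unnoticed.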
  • Nuvo (Forum Leader, VPS - Virtual Prince of the Server)
    An RSS or other feed reader (RSS 1.0, RSS 2.0 or Atom) wouldn't be much use, since he's looking for a way to work with HTML.
    Even if the documents were in XML format, he'd still either have to rewrite them to fit the RSS standard, or modify the reader so that it accepts the document format he's using.

    It's possible to split the documents down into more usable chunks with the explode() function, but then you would have to do something with them in order to make them usable.
    The easiest solution I can think of would be to put each one into an SQL database, which can be updated easily later on.
    I'd recommend something like this as the table structure (you may need to split each passage's title and subtitle out of the HTML first):
    ID (no dots in the actual column name)
    Title
    Subtitle
    Content
    Category

    That way, you would have the ability to look up documents by their titles or what's in the document.
    Having a category field is the obvious choice if you're going to put them in any kind of directory, since you'll want to pull up a group of results relevant to a subject without hit-and-miss searches.
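The table structure above could look roughly like the following. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for whatever SQL server is actually available (in PHP the equivalent would be prepared statements through PDO), and the table name passages, the function names and the sample rows are all invented for the example.

```python
import sqlite3


def create_table(conn):
    """Create the table with the five suggested columns."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS passages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               title TEXT NOT NULL,
               subtitle TEXT,
               content TEXT NOT NULL,
               category TEXT
           )"""
    )


def insert_passages(conn, passages, category):
    """Insert parsed passages, using placeholders so quotes in the text are safe."""
    conn.executemany(
        "INSERT INTO passages (title, subtitle, content, category) "
        "VALUES (?, ?, ?, ?)",
        [(p["title"], p["subtitle"], p["content"], category) for p in passages],
    )
    conn.commit()


def titles_in_category(conn, category):
    """Return the titles of all passages filed under one category."""
    rows = conn.execute(
        "SELECT title FROM passages WHERE category = ? ORDER BY id", (category,)
    )
    return [row[0] for row in rows]


def search(conn, term):
    """Look passages up by title or by what is in the content."""
    pattern = "%" + term + "%"
    rows = conn.execute(
        "SELECT title FROM passages WHERE title LIKE ? OR content LIKE ? "
        "ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


# Hypothetical rows, as they might come out of the splitting step.
sample = [
    {"title": "First title", "subtitle": "First subtitle",
     "content": "Text about gardening"},
    {"title": "Second title", "subtitle": "Second subtitle",
     "content": "Text about cooking"},
]

conn = sqlite3.connect(":memory:")
create_table(conn)
insert_passages(conn, sample, "Examples")
print(titles_in_category(conn, "Examples"))
print(search(conn, "cooking"))
```

Since one HTML document holds many passages, the category here is passed in once per document; if passages in the same file belong to different categories, it would have to be set per row instead.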