N2DFire
01-22-2005, 15:20
Alright, I know there should be an easy way to do this but for the life of me I can't find it.
I (or actually my g/f) have a textbook that has a web site with additional study aids. One of these aids is a section of flash cards. She also has a nifty little flash card program for her PDA that will take a .txt file (in the proper format) and display its contents as flash cards.
What we are wanting to do is somehow extract the data from the web site's flash cards and put them into a text file.
I'm comfortable enough with VB.NET text file manipulation (StreamReader) that this shouldn't be a problem. However, I can't get to the dumb HTML files to open them (because StreamReader won't accept a URL), and I can't seem to find a good program to copy the web site to my computer's hard drive.
The web site is set up such that there is a flashcard page that contains a lot of JavaScript that makes the system work. Under that there are subfolders for each chapter:
/Chapter1
.
.
.
/ChapterXX
In each chapter folder there are card files
/card1.html
/card2.html
.
.
.
/cardXX.html
Every card has the following format
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--
Each flashcard file contains term and definition for one card
as JavaScript variables. Page is written dynamically via
JavaScript function. All logic and style resides in shared files.
State variables in parent frameset determine how/when a single card
appears.
-->
<html>
<head>
<title>Card</title>
<script language="JavaScript">
//data for this card
var term = "adenohypophysis "
var def = "The anterior lobe of the pituitary gland."
var audio = "none"
</script>
<script language="JavaScript" src="../card.js"></script>
<LINK REL="Stylesheet" TYPE="text/css" HREF="../card.css">
</head>
<body bgcolor="#ffffff" background="../card.gif">
<script language="JavaScript">
//write the card
writeCard()
</script>
</body>
</html>
This page is http://media.pearsoncmg.com/bc/bc_martini_fap_6/flashcards/chapter18/card1.html
I can call each card page up on its own, but because the system was written as a frameset, the code required to make it display properly is not present.
What I need in a nutshell is a way to extract the value of var term & var def from each card file so that I can then write them out into a formatted .txt file for the PDA flashcard program.
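To make the goal concrete, here is a minimal sketch of the extraction step, written in Python rather than VB.NET purely for illustration. It assumes every card declares `term` and `def` as double-quoted strings on a single line, as in the sample above. The names `VAR_PATTERN` and `extract_card` are made up for this example, and escaped quotes inside a definition are not handled.

```python
import re

# Matches lines such as: var term = "adenohypophysis "
# Assumption: values are double-quoted and contain no escaped quotes.
VAR_PATTERN = re.compile(r'var\s+(term|def)\s*=\s*"([^"]*)"')


def extract_card(html):
    """Return (term, definition) from one card page's source, or None if either is missing."""
    found = dict(VAR_PATTERN.findall(html))
    if "term" not in found or "def" not in found:
        return None
    # The sample card has a trailing space inside the term, so trim both values.
    return found["term"].strip(), found["def"].strip()


# The script lines copied from the sample card in this post.
sample = '''
var term = "adenohypophysis "
var def = "The anterior lobe of the pituitary gland."
var audio = "none"
'''

card = extract_card(sample)
print(card)
```

The same regular expression should carry over to .NET's Regex class with little change, though I have not verified that here.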
Any help with accessing the on-line HTML files via VB.NET, or a good program to cache them to my HD so I can do it the "old" way I know, would be greatly appreciated.
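For the caching half, something along these lines might work: a rough Python sketch that saves each card page to a local folder so the files can then be read the usual way. The URL pattern is taken from the card1.html address given above. The assumptions that cards are numbered without gaps and that a missing card makes the request fail are guesses about the server, and `card_url`, `download_chapter`, and `max_cards` are hypothetical names.

```python
from pathlib import Path
import urllib.request

# Base address taken from the card URL in this post.
BASE = "http://media.pearsoncmg.com/bc/bc_martini_fap_6/flashcards"


def card_url(chapter, card, base=BASE):
    """Build the address of one card page, e.g. .../chapter18/card1.html."""
    return f"{base}/chapter{chapter}/card{card}.html"


def download_chapter(chapter, folder, max_cards=500):
    """Save card1.html, card2.html, ... into folder; return how many were saved.

    Assumption: cards are numbered consecutively, so the loop stops at the
    first request that fails (for example a 404 or a timeout).
    """
    out = Path(folder)
    out.mkdir(parents=True, exist_ok=True)
    saved = 0
    for n in range(1, max_cards + 1):
        try:
            with urllib.request.urlopen(card_url(chapter, n), timeout=10) as resp:
                data = resp.read()
        except OSError:  # URLError, HTTPError and timeouts all derive from OSError
            break
        # Write the raw bytes so no guess about the page encoding is needed.
        (out / f"card{n}.html").write_bytes(data)
        saved += 1
    return saved
```

Calling `download_chapter(18, "chapter18")` would then leave the chapter 18 cards in a local `chapter18` folder, provided the site still answers at that address.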
Thanks in Advance
Edited to fix URL