When I had to extract data, there was no easy way - at least I didn't find any. This little VBScript routine saved the day. I had to take short cuts to save time (to avoid all the authentication issues, I first open the site in IE and then run this. It does not use the same instance but new instance already have access to the site. And gave a 4 second sleep to complete the page load instead of waiting for an event).
Note that ~ is used as the delimiter to allow commas in data and non-breakable spaces in the table are replaced by regular spaces.
set FSO = CreateObject("Scripting.FileSystemObject") Set IE = WScript.CreateObject("InternetExplorer.Application", "IE_") IE.Visible = True recIndex=1 set outFile = FSO.CreateTextFile(".\outfile.csv") IE.Navigate "https://sites.google.com/site/<mysite/mypage>" wscript.sleep 4000 tmpSTR = IE.Document.DocumentElement.innerhtml set objTable = IE.Document.GetElementByID("goog-ws-list-table") set colRow = objTable.GetElementsByTagName("tr") For each objRow in colRow set colFields = objRow.GetElementsByTagName("TD") If Not isnull(colFields) then recIndex = recIndex + 1 outfile.Write recIndex For each objField in colFields outfile.Write "~ " & trim(replace(objField.InnerHTML," ",chr(32),1,-1)) Next outFile.Writeline Next outfile.Close MsgBox "Done"
No comments:
Post a Comment