Course lesson
Scrape a Website on a Schedule with Script Kit
When you want to collect news, airline ticket prices, or events from sites that don't offer APIs, you can use a scraper to grab elements directly from the page.
- Duration: 3 min
- Access: Free
Script Kit includes a scrapeSelector() helper that takes the URL you want to scrape and the selector for the elements you want from the page. With the // Schedule metadata comment, you can also have the script run in the background on a cron schedule and collect the data for you.
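The schedule used in this lesson's script, 0 11 * * *, follows the standard five-field cron layout and means "at 11:00 every day". As a rough aid for reading such expressions, the sketch below labels each field. The describeCron helper and the CRON_FIELDS list are hypothetical names made up for illustration; they are not part of Script Kit.

```javascript
// Hypothetical helper (not a Script Kit API): label the five fields
// of a standard cron expression so it is easier to read.
const CRON_FIELDS = ["minute", "hour", "dayOfMonth", "month", "dayOfWeek"]

function describeCron(expression) {
  const parts = expression.trim().split(/\s+/)
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}`)
  }
  // Pair each field name with the value found at the same position
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]))
}

console.log(describeCron("0 11 * * *"))
```

Here minute is 0 and hour is 11, while the three * fields match every day of the month, every month, and every day of the week.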
// Name: Scrape Tech News
// Schedule: 0 11 * * *
import "@johnlindquist/kit"

// Collect every h3 from the Google News topic page
let h3s = await scrapeSelector(
  "https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB?hl=en-US&gl=US&ceid=US%3Aen",
  "h3"
)

// Make sure tech.md exists in the home directory before appending to it
let filePath = home("tech.md")
await ensureFile(filePath)

// Build a dated section with one heading per scraped h3
let contents =
  `
## ${new Date()}
` + h3s.map(h3 => `### ${h3}`).join("\n")

await appendFile(filePath, contents)
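The step that builds contents can be tried without Script Kit or a network request by pulling it into a plain function. This is only a sketch: formatSection is a hypothetical name, and the date string and headlines are made-up sample data rather than scraped results.

```javascript
// Hypothetical helper: builds the same Markdown section as the script,
// but from plain arguments so it can run without scraping anything.
function formatSection(date, headlines) {
  const heading = `\n## ${date}\n`
  const items = headlines.map(headline => `### ${headline}`).join("\n")
  return heading + items
}

// Made-up sample data for illustration
const sample = formatSection("2024-01-01", ["First headline", "Second headline"])
console.log(sample)
```

Each section starts with a newline, so a later append begins on a fresh line in tech.md even though the previous section does not end with one.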