Course lesson

Scrape a Website on a Schedule with Script Kit

When you want to collect news sources, airline ticket prices, or any events from sites that don't offer APIs, you can use a scraper to grab elements off the page.

Duration
3 min
Access
Free
Transcript

Script Kit includes a scrapeSelector() helper that takes the URL you want to scrape and the CSS selector you want to match on the page. With the // Schedule metadata, the script can also run in the background on a cron schedule and collect the data for you.
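
To make the schedule syntax concrete, here is a minimal sketch that labels the five fields of a standard cron expression. The describeCron function and the CRON_FIELDS list are hypothetical names used only for illustration; they are not part of Script Kit, and the sketch assumes the plain five-field cron format.

```javascript
// Hypothetical helper, not part of Script Kit: label the five fields of a
// standard cron expression so a schedule is easier to read.
const CRON_FIELDS = ["minute", "hour", "dayOfMonth", "month", "dayOfWeek"];

function describeCron(expression) {
  // Fields are separated by whitespace
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}`);
  }
  // Pair each field name with the value found at the same position
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]));
}

// Minute 0 of hour 11, with "*" (every value) for the remaining fields:
// once a day at 11:00.
const schedule = describeCron("0 11 * * *");
console.log(schedule);
```

Reading the fields left to right (minute, hour, day of month, month, day of week) is usually enough to sanity-check a schedule before leaving a script to run unattended.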


// Name: Scrape Tech News
// Schedule: 0 11 * * *

import "@johnlindquist/kit"

// Collect every h3 element (the headlines) from the Google News topic page
let h3s = await scrapeSelector(
  "https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB?hl=en-US&gl=US&ceid=US%3Aen",
  "h3"
)

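// Append each run's headlines to tech.md in the home directory,
// creating the file first if it doesn't exist yet
let filePath = home("tech.md")
await ensureFile(filePath)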
let contents =
  `

## ${new Date()}

` + h3s.map(h3 => `### ${h3}`).join("\n")
await appendFile(filePath, contents)
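
The string-building step can also be pulled out into a small pure function, which makes it easy to check without hitting the network. This is a minimal sketch, assuming scrapeSelector() resolves to an array of strings (as the h3s.map() call suggests). The formatSection name is hypothetical, and trimming, dropping blanks, removing duplicates, and using an ISO timestamp are additions that the original script does not make.

```javascript
// Hypothetical helper: build the Markdown section that gets appended to tech.md.
// Assumes `headlines` is an array of strings.
function formatSection(date, headlines) {
  // Trim whitespace, drop empty entries, and remove duplicates
  // (extra steps that the original script does not perform).
  const unique = [...new Set(headlines.map(h => h.trim()).filter(Boolean))];
  const body = unique.map(h => `### ${h}`).join("\n");
  // ISO timestamp instead of the default Date string (a choice made here)
  return `\n\n## ${date.toISOString()}\n\n${body}`;
}

const section = formatSection(new Date("2024-01-01T11:00:00Z"), [
  "First headline",
  " First headline ",
  "",
  "Second headline",
]);
console.log(section);
```

In the scheduled script, the last line could then become await appendFile(filePath, formatSection(new Date(), h3s)).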