Website Sitemap Scraper

The Website Sitemap Scraper is a Python script that locates a website's sitemap and extracts the links it lists. It is useful for getting an overview of a site's structure and content.

Features

- Fetches sitemap links from a specified website.
- Saves the sitemap links to a text file for future reference.
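
As a rough illustration of the two features above, the sketch below pulls every `<loc>` URL out of a sitemap document and writes the URLs to a text file, one per line. It is not the code in sitemap_post_scraper.py: it uses the standard library's `xml.etree.ElementTree` in place of selectolax so that it runs without extra dependencies, and the names `extract_sitemap_links` and `save_links` are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List


def extract_sitemap_links(xml_text: str) -> List[str]:
    """Return the text of every <loc> element, in document order."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.iter():
        # Parsed tags look like "{namespace}loc"; compare only the local name.
        # Note that this also matches <loc> elements from other namespaces.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            links.append(element.text.strip())
    return links


def save_links(links: List[str], path: str) -> None:
    """Write one link per line to the given file (hypothetical helper)."""
    Path(path).write_text("".join(link + "\n" for link in links), encoding="utf-8")
```

The same function works for a sitemap index file, since the nested sitemap URLs there are also stored in `<loc>` elements.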

Prerequisites

Before you begin, ensure you have met the following requirements:

- Python 3.7 or higher installed on your system.
- The following Python libraries:
  - httpx: used for making asynchronous HTTP requests.
  - selectolax: used for parsing HTML/XML content.

After cloning the repository (see Usage), install the required libraries from the project directory using pip:

pip install -r requirements.txt

Usage

Clone this repository to your local machine:

git clone https://github.com/SSujitX/Sitemap-Postlink-Scraper.git

Navigate to the project directory:

cd Sitemap-Postlink-Scraper

Run the script:

python sitemap_post_scraper.py

When prompted, enter the URL of the website you want to scrape.
If the website exposes a sitemap, the script fetches the links it contains and saves them to a text file named _sitemap_links.txt.
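
This README does not say how the script decides whether a sitemap exists. One common approach, shown here purely as an assumption about how such a check could work, is to try the conventional /sitemap.xml path and to read any `Sitemap:` lines from the site's robots.txt. The helper names `site_root`, `default_sitemap_url`, and `sitemaps_from_robots` are hypothetical, and the HTTP fetching itself is left out so the sketch stays dependency-free.

```python
from typing import List
from urllib.parse import urlsplit


def site_root(url: str) -> str:
    """Reduce user input such as 'example.com/blog' to 'https://example.com'."""
    if "://" not in url:
        url = "https://" + url  # assumption: default to HTTPS when no scheme is given
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def default_sitemap_url(url: str) -> str:
    """Return the conventional sitemap location for the site."""
    return site_root(url) + "/sitemap.xml"


def sitemaps_from_robots(robots_text: str) -> List[str]:
    """Collect the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_text.splitlines():
        # Split on the first colon only, so the URL's own "://" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found
```

A caller could request `default_sitemap_url(...)` first and fall back to the robots.txt entries if that request fails.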

Repository: SSujitX/Sitemap-Postlink-Scraper
