More readme updates
commit 54f52b266f

# Overview

A script that scrapes a configured list of websites, keeps only the text between each site's start and end lines (matched by regular expressions), and can then commit the results to a git repository.

It is intended for tracking changes to terms of service, privacy policies, and similar documents.

# Usage
```
rob@crom:tostracker$ ./tostracker.sh -h
usage: ./tostracker.sh OPTIONS

OPTIONS:
   -c filename   Use given config file instead of default (./config)
   -F char       Use given character as a field separator in config file instead of default (@)
   -gc           After site scrapes, run 'git add' on all files, then 'git commit'
   -gp           After site scrapes, run 'git add' on all files, then 'git commit', then 'git push'
   -o dirname    Use given output dir instead of default (.)
   -t sitename   Just output raw content of given site, useful for finding start/end regexps.
   -T sitename   Just output content of given site between re_start and re_end regexps.
```
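
The `-T` option prints a site's content between its `re_start` and `re_end` regexps. The script's own filtering code is not shown in this README, so the following is only a minimal sketch of that kind of filter. It assumes the start line is kept and the end line is dropped; the function name `filter_between` and the sample text are hypothetical.

```shell
#!/bin/sh
# Hypothetical filter (not taken from tostracker.sh): read text on stdin and
# print the lines from the first match of the start regexp up to, but not
# including, the next line that matches the end regexp.
filter_between() {
    # Pass the regexps through the environment so awk does not reinterpret
    # backslashes the way it would for -v assignments.
    RE_START=$1 RE_END=$2 awk '
        BEGIN { start = ENVIRON["RE_START"]; stop = ENVIRON["RE_END"] }
        !inside && $0 ~ start { inside = 1; print; next }  # first start line
        inside && $0 ~ stop   { exit }                     # end line: stop here
        inside                { print }                    # body of the section
    '
}

# Sample input standing in for the raw output of a scraped page.
printf '%s\n' 'Site menu' 'Terms of Service' 'You agree to the rules.' 'Contact' 'Footer' |
    filter_between '^Terms of Service$' '^Contact$'
```

Running `-t sitename` first to inspect the raw content, then trying candidate regexps against it with a filter like this, is one way to settle on start and end patterns before adding them to the config file.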
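
# Example Config file

The entries below are based on online services used by preschools. Each line has four fields separated by the field separator (`@` by default): site name, URL, start regexp, and end regexp.
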
```
seesaw_tos@https://web.seesaw.me/terms-of-service@^Terms of Service$@^Contact$
seesaw_privacy@https://web.seesaw.me/privacy-policy@^1.*Introduction$@^Last Updated
languagenut_tos@https://www.languagenut.com/en-au/terms/@^Terms of Service$@^FAQs$
languagenut_privacy@https://www.languagenut.com/en-au/privacy-policy/@^PRIVACY POLICY$@^FAQs$
acer_tos@https://www.acer.org/online-terms-of-use@^Legal agreement$@^Contact us$
acer_privacy@https://www.acer.org/privacy@^Privacy Policies and Legal Disclaimers$@^Contact us$
3plearning_tos@https://www.3plearning.com/terms?locate=en-AU@^Last updated@^Skip to$
3plearning_privacy@https://www.3plearning.com/privacy?locate=en-AU@^Last updated@^Skip to$
```
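
Each config line splits into four fields on the separator character. As a minimal sketch of how such a line could be parsed in shell, assuming the field order is site name, URL, start regexp, end regexp: `parse_config_line` is a hypothetical helper for illustration, not a function from `tostracker.sh`.

```shell
#!/bin/sh
# Hypothetical helper: split one config line into its four fields.
# The second argument is the separator; it defaults to "@", the
# documented default for the -F option.
parse_config_line() {
    line=$1
    sep=${2:-@}
    # IFS is set only for this read, so spaces inside a regexp are kept.
    IFS=$sep read -r name url re_start re_end <<EOF
$line
EOF
    printf 'name=%s\nurl=%s\nre_start=%s\nre_end=%s\n' \
        "$name" "$url" "$re_start" "$re_end"
}

# Single quotes keep the shell from expanding "$@" inside the regexps.
parse_config_line 'seesaw_tos@https://web.seesaw.me/terms-of-service@^Terms of Service$@^Contact$'
```

A separator that never appears in the URLs or regexps keeps this kind of split unambiguous, which is presumably why `-F` allows choosing a different character.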

# Examples

## Basic usage - scrape only

```
rob@crom:tostracker$ ./tostracker.sh
Scraped 'seesaw_tos' to '/Users/rob/.tostracker/output/seesaw_tos.txt' (no change)
Scraped 'seesaw_privacy' to '/Users/rob/.tostracker/output/seesaw_privacy.txt' (no change)
Scraped 'languagenut_tos' to '/Users/rob/.tostracker/output/languagenut_tos.txt' (no change)
Scraped 'languagenut_privacy' to '/Users/rob/.tostracker/output/languagenut_privacy.txt' (no change)
Scraped 'acer_tos' to '/Users/rob/.tostracker/output/acer_tos.txt' (no change)
Scraped 'acer_privacy' to '/Users/rob/.tostracker/output/acer_privacy.txt' (no change)
Scraped '3plearning_tos' to '/Users/rob/.tostracker/output/3plearning_tos.txt' (no change)
Scraped '3plearning_privacy' to '/Users/rob/.tostracker/output/3plearning_privacy.txt' (no change)
```

## Scrape and commit results to git repo

```
rob@crom:tostracker$ ./tostracker.sh -gc
Scraped 'seesaw_tos' to '/Users/rob/.tostracker/output/seesaw_tos.txt' (no change)
Scraped 'seesaw_privacy' to '/Users/rob/.tostracker/output/seesaw_privacy.txt' (no change)
Scraped 'languagenut_tos' to '/Users/rob/.tostracker/output/languagenut_tos.txt' (no change)
Scraped 'languagenut_privacy' to '/Users/rob/.tostracker/output/languagenut_privacy.txt' (no change)
Scraped 'acer_tos' to '/Users/rob/.tostracker/output/acer_tos.txt' (no change)
Scraped 'acer_privacy' to '/Users/rob/.tostracker/output/acer_privacy.txt' (no change)
Scraped '3plearning_tos' to '/Users/rob/.tostracker/output/3plearning_tos.txt' (no change)
Scraped '3plearning_privacy' to '/Users/rob/.tostracker/output/3plearning_privacy.txt' (UPDATES FOUND)
Doing git add...ok
Doing git commit...ok
```
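
The `(no change)` and `(UPDATES FOUND)` markers in the output above suggest that each new scrape is compared with the previously saved file. How the script does this is not documented here. One plausible approach, sketched with a hypothetical `save_scrape` helper, is to compare the new text with the old file using `cmp -s` and replace the file only when they differ. Treating a first-time scrape as an update is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not taken from tostracker.sh): read freshly scraped
# text on stdin, store it in the given output file, and report whether the
# stored content changed.
save_scrape() {
    name=$1
    outfile=$2
    tmp=$(mktemp) || return 1
    cat > "$tmp"                        # new content arrives on stdin
    if [ -f "$outfile" ] && cmp -s "$tmp" "$outfile"; then
        status='no change'
        rm -f "$tmp"                    # identical: keep the existing file
    else
        status='UPDATES FOUND'          # assumption: a missing file counts as an update
        mv "$tmp" "$outfile"
    fi
    printf "Scraped '%s' to '%s' (%s)\n" "$name" "$outfile" "$status"
}

# Demo in a throwaway directory: the second run sees identical content.
demo_dir=$(mktemp -d)
printf 'Version 1 of the terms\n' | save_scrape demo_tos "$demo_dir/demo_tos.txt"
printf 'Version 1 of the terms\n' | save_scrape demo_tos "$demo_dir/demo_tos.txt"
rm -rf "$demo_dir"
```

Leaving unchanged files untouched means a later `git add` and `git commit` would only record sites whose text actually changed.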

## Scrape, commit results to git repo, then git push to upstream

```
rob@crom:tostracker$ ./tostracker.sh -gp
Scraped 'seesaw_tos' to '/Users/rob/.tostracker/output/seesaw_tos.txt' (no change)
Scraped 'seesaw_privacy' to '/Users/rob/.tostracker/output/seesaw_privacy.txt' (no change)
Scraped 'languagenut_tos' to '/Users/rob/.tostracker/output/languagenut_tos.txt' (no change)
Scraped 'languagenut_privacy' to '/Users/rob/.tostracker/output/languagenut_privacy.txt' (no change)
Scraped 'acer_tos' to '/Users/rob/.tostracker/output/acer_tos.txt' (no change)
Scraped 'acer_privacy' to '/Users/rob/.tostracker/output/acer_privacy.txt' (no change)
Scraped '3plearning_tos' to '/Users/rob/.tostracker/output/3plearning_tos.txt' (no change)
Scraped '3plearning_privacy' to '/Users/rob/.tostracker/output/3plearning_privacy.txt' (UPDATES FOUND)
Doing git add...ok
Doing git commit...ok
Doing git push...ok
```