README.md - Issue 29968569: Issue 4014 - Publish python-abp on PyPI

Unified Diff: README.md

Issue 29968569: Issue 4014 - Publish python-abp on PyPI (Closed) Base URL: https://hg.adblockplus.org/python-abp/

Patch Set: Address comments on PS2, add README ToC Created Dec. 29, 2018, 1:29 a.m.


Index: README.md

===================================================================

--- a/README.md

+++ b/README.md

@@ -1,10 +1,23 @@

# python-abp

-This repository contains a library for working with Adblock Plus filter lists

-and the script that is used for building Adblock Plus filter lists from the

-form in which they are authored into the format suitable for consumption by the

-adblocking software.

+This repository contains a library for working with Adblock Plus filter lists,

+a script for rendering diffs between filter lists, and the script that is used

+for building Adblock Plus filter lists from the form in which they are authored

+into the format suitable for consumption by the adblocking software (aka

+rendering).

+## Table of Contents

+- [Installation](#installation)

+- [Rendering of filter lists](#rendering)

+- [Generating diffs](#diffs)

+- [Library API](#library)

+- [Testing](#testing)

+- [Development](#development)

+- [Using the library with R](#r)

+<a id="installation"></a>

## Installation

Prerequisites:

@@ -15,16 +28,17 @@

To install:

- $ pip install -U python-abp

+ $ pip install --upgrade python-abp

+<a id="rendering"></a>

## Rendering of filter lists

The filter lists are originally authored in relatively smaller parts focused

-on a particular type of filters, related to a specific topic or relevant

-for particular geographical area.

-We call these parts _filter list fragments_ (or just _fragments_)

-to distinguish them from full filter lists that are

-consumed by the adblocking software such as Adblock Plus.

+on particular types of filters, related to a specific topic or relevant for a

+particular geographical area.

+We call these parts _filter list fragments_ (or just _fragments_) to

+distinguish them from full filter lists that are consumed by the adblocking

+software such as Adblock Plus.

Rendering is a process that combines filter list fragments into a filter list.

It starts with one fragment that can include other ones and so forth.

@@ -34,17 +48,17 @@

$ flrender fragment.txt filterlist.txt

-This will take the top level fragment in `fragment.txt`, render it and save into

-`filterlist.txt`.

+This will take the top level fragment in `fragment.txt`, render it and save it

+into `filterlist.txt`.

The `flrender` script can also be used by only specifying `fragment.txt`:

- $flrender fragment.txt

+ $ flrender fragment.txt

in which case the rendering result will be sent to `stdout`. Moreover, when

it's run with no positional arguments:

- $flrender

+ $ flrender

it will read from `stdin` and send the results to `stdout`.

@@ -54,25 +68,25 @@

%include http://www.server.org/dir/list.txt%

%include easylist:easylist/easylist_general_block.txt%

-The first instruction contains a URL that will be fetched and inserted at the

-point of reference.

-The second one contains a path inside easylist repository.

+The http include contains a URL that will be fetched and inserted at the point

+of reference.

+The local include contains a path inside the easylist repository.

`flrender` needs to be able to find a copy of the repository on the local

filesystem. We use the `-i` option to point it to the right directory:

$ flrender -i easylist=/home/abc/easylist input.txt output.txt

-Now the second reference above will be resolved to

-`/home/abc/easylist/easylist/easylist_general_block.txt` and the fragment will

-be loaded from this file.

+Now the local include referenced above will be resolved to:

+`/home/abc/easylist/easylist/easylist_general_block.txt`

+and the fragment will be loaded from this file.
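The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the `name:path` convention, not the code `flrender` actually runs; the function name `resolve_include` and the `sources` dictionary are hypothetical.

```python
import posixpath


def resolve_include(target, sources):
    """Map an include target to a URL or a local path (illustrative only)."""
    # Web includes are used as-is: the URL itself is what gets fetched.
    if target.startswith(('http://', 'https://')):
        return target
    # Local includes have the form "<source name>:<path inside the source>".
    name, sep, path = target.partition(':')
    if not sep:
        raise ValueError('expected "<source>:<path>", got {!r}'.format(target))
    if name not in sources:
        raise KeyError('unknown source {!r}'.format(name))
    return posixpath.join(sources[name], path)


# Mirrors `-i easylist=/home/abc/easylist` from the example above.
sources = {'easylist': '/home/abc/easylist'}
print(resolve_include('easylist:easylist/easylist_general_block.txt', sources))
# prints /home/abc/easylist/easylist/easylist_general_block.txt
```

The sketch uses `posixpath` so the result looks the same on every platform; a real implementation would more likely use the platform's native path handling.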

Directories that contain filter list fragments that are used during rendering

are called sources.

They are normally working copies of the repositories that contain filter list

fragments.

-Each source is identified by a name: that's the part that comes before ":"

-in the include instruction and it should be the same as what comes before "="

-in the `-i` option.

+Each source is identified by a name: that's the part that comes before ":" in

+the include instruction and it should be the same as what comes before "=" in

+the `-i` option.

Commonly used sources have generally accepted names. For example the main

EasyList repository is referred to as `easylist`.

@@ -86,24 +100,25 @@

You can clone the necessary repositories to a local directory and add `-i`

options accordingly.

-## Rendering diffs

+<a id="diffs"></a>

+## Generating diffs

A diff allows a client running ad blocking software such as Adblock Plus to

update the filter lists incrementally, instead of downloading a new copy of a

full list during each update. This is meant to lessen the amount of resources

used when updating filter lists (e.g. network data, memory usage, battery

-consumption, etc.), allowing clients to update their lists more frequently using

-less resources.

+consumption, etc.), allowing clients to update their lists more frequently

+using less resources.

-Python-abp contains a script called `fldiff` that will find the diff between the

-latest filter list, and any number of previous filter lists:

+python-abp contains a script called `fldiff` that will find the diff between

+the latest filter list, and any number of previous filter lists:

- $ fldiff -o diffs/easylist easylist.txt archive/*

+ $ fldiff -o diffs/easylist/ easylist.txt archive/*

-where `-o diffs/easylist` is the (optional) output directory where the diffs

-should be written, `easylist.txt` is the most recent version of the filter list,

-and `archive/*` is the directory where all the archived filter lists are. When

-called like this, the shell should automatically expand the `archive/*`

+where `-o diffs/easylist/` is the (optional) output directory where the diffs

+should be written, `easylist.txt` is the most recent version of the filter

+list, and `archive/*` is the directory where all the archived filter lists are.

+When called like this, the shell should automatically expand the `archive/*`

directory, giving the script each of the filenames separately.

In the above example, the output of each archived `list[version].txt` will be

@@ -117,10 +132,10 @@

* Added filters of the form `+ <filter-text>`

* Removed filters of the form `- <filter-text>`
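As a rough illustration of the `+ <filter-text>` / `- <filter-text>` lines listed above, the following sketch compares two lists by set difference. It is not `fldiff`'s implementation: the function `diff_filters` is hypothetical, and the sketch deliberately ignores comments, metadata and the `[Adblock Plus ...]` header.

```python
def diff_filters(old_lines, new_lines):
    """Return '+ <filter>' and '- <filter>' lines (illustrative only)."""
    def filters(lines):
        # Keep only filter lines: skip blanks, "!" comments/metadata and
        # the "[...]" header. This is a simplification for the example.
        stripped = (line.strip() for line in lines)
        return {s for s in stripped if s and not s.startswith(('!', '['))}

    old, new = filters(old_lines), filters(new_lines)
    added = ['+ ' + f for f in sorted(new - old)]
    removed = ['- ' + f for f in sorted(old - new)]
    return added + removed


old = ['[Adblock Plus 2.0]', 'abc.com/ad$image', 'old.com##div']
new = ['[Adblock Plus 2.0]', 'abc.com/ad$image', '@@/abc\\.com/']
for line in diff_filters(old, new):
    print(line)
```

Using sets means the order of filters within a list does not matter, which matches the idea that a client only needs to know which filters to add and which to drop.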

+<a id="library"></a>

## Library API

-Python-abp can also be used as a library for parsing filter lists. For example

+python-abp can also be used as a library for parsing filter lists. For example

to read a filter list (we use Python 3 syntax here but the API is the same):

from abp.filters import parse_filterlist

@@ -129,7 +144,7 @@

for line in parse_filterlist(filterlist):

    print(line)

-If `filterlist.txt` contains a filter list:

+If `filterlist.txt` contains this filter list:

[Adblock Plus 2.0]

! Title: Example list

@@ -137,7 +152,6 @@

abc.com,cdf.com##div#ad1

abc.com/ad$image

@@/abc\.com/

- ...

the output will look something like:

@@ -147,26 +161,24 @@

Filter(text='abc.com,cdf.com##div#ad1', selector={'type': 'css', 'value': 'div#ad1'}, action='hide', options=[('domain', [('abc.com', True), ('cdf.com', True)])])

Filter(text='abc.com/ad$image', selector={'type': 'url-pattern', 'value': 'abc.com/ad'}, action='block', options=[('image', True)])

Filter(text='@@/abc\\.com/', selector={'type': 'url-regexp', 'value': 'abc\\.com'}, action='allow', options=[])

- ...

-`abp.filters` module also exports a lower-level function for parsing individual

-lines of a filter list: `parse_line`. It returns a parsed line object just like

-the items in the iterator returned by `parse_filterlist`.

+The `abp.filters` module also exports a lower-level function for parsing

+individual lines of a filter list: `parse_line`. It returns a parsed line

+object just like the items in the iterator returned by `parse_filterlist`.

For further information on the library API use `help()` on `abp.filters` and

-its contents in interactive Python session, read the docstrings or look at the

-tests for some usage examples.

+its contents in an interactive Python session, read the docstrings, or look at

+the tests for some usage examples.

+<a id="testing"></a>

## Testing

-Unit tests for `python-abp` are located in the `/tests` directory.

-[Pytest][2] is used for quickly running the tests

-during development.

-[Tox][3] is used for testing in different

-environments (Python 2.7, Python 3.5+ and PyPy) and code quality

-reporting.

+Unit tests for `python-abp` are located in the `/tests` directory. [Pytest][2]

+is used for quickly running the tests during development. [Tox][3] is used for

+testing in different environments (Python 2.7, Python 3.5+ and PyPy) and code

+quality reporting.

-In order to execute the tests, first create and activate development

+In order to execute the tests, first create and activate a development

virtualenv:

$ python setup.py devenv

@@ -180,17 +192,18 @@

(devenv) $ tox

+<a id="development"></a>

## Development

-When adding new functionality, add tests for it (preferably first). Code

-coverage (as measured by `tox -e qa`) should not decrease and the tests

-should pass in all Tox environments.

+When adding new functionality, add tests for it (preferably first). If some

+code will never be reached on a certain version of Python, it may be exempted

+from coverage tests by adding a comment, e.g. `# pragma: no py2 cover`.
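A minimal sketch of where such a comment might go, assuming a branch that only runs on Python 3. The helper `ensure_text` is hypothetical, and how the project's coverage configuration recognises the comment is not shown in this diff.

```python
import sys


def ensure_text(value, encoding='utf-8'):
    """Return value as text, decoding bytes on Python 3 (hypothetical helper)."""
    # On Python 2 the version check fails first, so the body of this branch
    # is never reached there; the comment marks it as exempt from coverage.
    if sys.version_info[0] >= 3 and isinstance(value, bytes):  # pragma: no py2 cover
        return value.decode(encoding)
    return value
```

Placing the comment on the `if` line is meant to exempt the whole branch rather than a single statement.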

All public functions, classes and methods should have docstrings compliant with

[NumPy/SciPy documentation guide][4]. One exception is the constructors of

classes that the user is not expected to instantiate (such as exceptions).

+<a id="r"></a>

## Using the library with R

Clone the repo to your local machine. Then create a virtualenv and install
