MeetVaishnav
56ced00176
Merge c1b36a2cdc into 648fb2a18b
2025-12-17 15:55:51 +08:00
Lucas Ou-Yang
648fb2a18b
remove RapidProxy
2025-12-06 08:29:46 +08:00
Lucas Ou-Yang
4ab043bf08
remove thordata advert from readme
2025-11-23 21:43:38 -08:00
Lucas Ou-Yang
90441f027e
Update README with new proxy services and remove old content
2025-11-03 00:32:49 -08:00
Lucas Ou-Yang
546c4cd617
Add Thordata + proxy example
2025-10-22 20:52:44 -07:00
Lucas Ou-Yang
8901de7451
remove bd from readme ads
2025-10-14 22:41:54 +07:00
Lucas Ou-Yang
6c1dcddd3a
Update README.rst
2025-08-14 15:56:11 +07:00
Lucas Ou-Yang
bde5d50f5d
Update README.rst - sponsors
2025-07-11 07:52:35 -07:00
Lucas Ou-Yang
1b5fce18d4
Update README.rst - more sponsors
2025-07-10 23:35:11 -07:00
Lucas Ou-Yang
b39a4b407f
Update README.rst include sponsors
2025-07-10 23:25:16 -07:00
Lucas Ou-Yang
ba8d2f41be
Update README.rst
2025-03-06 17:44:47 -08:00
MeetVaishnav
c1b36a2cdc
Update README.rst
2020-10-01 21:30:51 +05:30
Lucas Ou-Yang
f622011177
Update README.rst
2020-09-01 23:54:25 -07:00
Lucas Ou-Yang
5af1bea20f
Update README.rst
2020-07-12 18:16:14 -07:00
Lucas Ou-Yang
b0cc1278c4
Update README.rst
2020-07-04 19:34:46 -07:00
Lucas Ou-Yang
4b35117e7e
Update README.rst
2020-07-03 12:10:45 -07:00
Lucas Ou-Yang
2f6ca8fa63
Update README.rst
2020-07-03 12:09:51 -07:00
Lucas Ou-Yang
db81b55aab
Update README.rst
2020-07-03 12:04:24 -07:00
Lucas Ou-Yang
1c27e6da19
Update README.rst
...
Project support
2020-06-27 19:49:08 -07:00
Bachstelze
837bd13e96
changed 404 url ( #819 )
2020-06-25 22:47:35 -07:00
Lucas Ou-Yang
56de65af9e
Update README.rst
2020-06-22 16:36:27 -07:00
Lucas Ou-Yang
cba0658011
Update README.rst
2020-06-22 16:35:16 -07:00
Kyle Jones
a0f725333a
Dropping python 3.4 support ( #768 )
...
* Dropping python 3.4 support
* Fixing build issues
* Changed version number - incremented major version due to breaking change
* Removing pandas dependency
2020-06-22 13:38:44 -07:00
Lucas Ou-Yang
1c7feb1c55
Update README, gitad setup
2020-06-20 11:59:42 -07:00
Lucas Ou-Yang
9b89046d07
Add donations links in readme
2019-04-12 08:23:13 -04:00
Lucas Ou-Yang
cf85a7eadf
Modify readme
2019-04-07 16:31:52 -04:00
Lucas Ou-Yang
2788a2fdcd
Replace patreon with consulting
2019-04-07 16:00:31 -04:00
Akash Nidhi P S
c258db1e54
Added more stopwords for stopwords-hi.txt ( #675 )
2019-03-16 20:58:27 -04:00
Guy Rosin
069a437920
Update extractors.py ( #688 )
...
Added a date tag
2019-03-16 20:56:20 -04:00
bact
4c9cde0749
Add Thai stopwords ( #669 )
...
* Add Thai stopwords from stopwordsiso
* add "th" to language_dict
* add unit test and test data files for Thai language
* - add pythainlp to requirements.txt
- sort requirements.txt
* Update and sort supported language list
* sort the language list
* update language list in docs/index.rst
2019-03-16 20:53:04 -04:00
Lucas Ou-Yang
1cb6a1b143
Another edit to the Patreon change
2019-03-10 21:26:19 -04:00
Lucas Ou-Yang
0deaaa1ec5
Add Patreon support page.
2019-03-10 21:22:26 -04:00
danieleago
1ee2fdbfa5
Update stopwords-it.txt ( #660 )
...
fix strip word
2019-01-04 20:30:09 -08:00
ekaterinasmarp
11cbf3a303
Ignoring http pages depending on their content-type ( #658 )
...
* Ignoring http pages depending on their content-type, PDFs are ignored by default
* Code review fixes
* Code review fixes
* Code review fixes
* Code review fixes
2018-12-27 07:06:06 -08:00
Agnel Vishal
162c168e8d
Updated comments. ( #659 )
...
* Updated comments.
The previous comment was difficult to understand.
* Changed as requested.
* Removed space
2018-12-05 00:02:26 +09:00
Torben Brodt
e84b666136
Fix extracting proper author information with nested tags ( #651 )
...
tested with 7392243
where author is in nested structure
```
<span itemprop="author" itemscope itemtype="http://schema.org/Person">
<span itemprop="name">
Klaus Egelund</span>
</span>
```
2018-12-04 00:48:31 +09:00
Piotr Grzesik
c8a0455b81
Add extraction of meta_site_name ( #630 )
2018-10-28 13:25:14 -07:00
Evaldas Kazlauskis
7f388b37a7
Adding Lithuanian language support ( #639 )
2018-10-27 15:09:09 -07:00
Piotr Grzesik
4013b6ad04
Skip removing last diff if it's known that it has non-media class ( #633 )
2018-10-26 11:37:14 -07:00
Piotr Grzesik
1d095200b1
Fix broadcasting typo in cleaners ( #634 )
2018-10-11 18:50:14 +07:00
Dan Robertson
4a540cbcd9
Handle file scheme in Article.download ( #598 )
...
- Update the Article download function to handle the file scheme.
- Add test cases for using Article.download with a file url
2018-10-04 12:11:50 +07:00
Lucas Ou-Yang
d1766a8b84
Change package management script to use twine
2018-09-27 22:03:59 -07:00
Lucas Ou-Yang
7dc200fa31
Version bump: 0.2.7 => 0.2.8
2018-09-27 21:53:09 -07:00
Nuno Pinheiro
9af47d1e25
Added div_to_para transformation for section tags ( #627 )
2018-09-22 14:53:29 -07:00
Lucas Ou-Yang
8fa5ae1546
Add Article constructor param sanitization guards ( #623 )
2018-09-05 00:34:07 -07:00
Sam Fonseca
146c4fd304
add "byl" attr val to byline parsing ( #619 )
2018-09-04 23:50:55 -07:00
Lucas Ou-Yang
7a39f9d717
Kill print(..) statements and replace with logging or exceptions ( #622 )
2018-09-04 23:47:05 -07:00
Lucas Ou-Yang
2f7fc40ac9
Improve mthreading.py code, add override threads option, remove unused ( #618 )
2018-09-02 23:21:22 -07:00
Shevchenko Vitaliy
beacce0e16
Use try except on creating dirs to avoid FileExistsError because of race condition ( #617 )
2018-08-31 18:01:35 -07:00
Lev E. Givon
f8b8e52d14
Enable async downloading of individually specified articles. ( #548 )
2018-08-30 01:04:28 -07:00