Commit graph

653 commits

Author SHA1 Message Date
Lucas Ou-Yang
3365d2c4c7 Merge pull request #383 from jamesmallen/gravity-score
Fix gravity score conversion
2017-06-14 14:12:18 -07:00
James M. Allen
2cb151edfe using ternary instead of and 2017-06-14 16:50:30 -04:00
James M. Allen
44555c9ad2 Fix gravity score conversion 2017-06-14 16:39:37 -04:00
codelucas
d8a022ac02 Change setup.py python3 requirement to be only in setup stage, not upload 2017-06-13 18:57:46 -07:00
codelucas
a7adfab3b7 Version bump 2.1.1 => 2.1.2 2017-06-11 03:38:18 -07:00
codelucas
fba5017691 Also change setup.py for python3 branch to error if attempting to install with python2 2017-06-11 03:25:47 -07:00
codelucas
7232d8d53a Make error warnings for download(..) clearer 2017-06-11 03:02:02 -07:00
codelucas
2df4283d4b Merge branch 'master' of https://github.com/codelucas/newspaper 2017-06-11 01:23:16 -07:00
codelucas
6f584fe1f3 Version bump 2.0.0 => 2.0.1 2017-06-11 01:23:02 -07:00
louyang
040f9528d6 Version bump 2.0.0 => 2.0.1 2017-06-11 01:22:24 -07:00
Lucas Ou-Yang
2cf14c9d5a Merge pull request #378 from codelucas/handle-http-response-download
Add logic to handle http != 2XX response failures. Inform user download happened
2017-06-11 01:07:55 -07:00
louyang
4a60cfaa34 More refactors and title logic change 2017-06-11 00:56:27 -07:00
louyang
c6cb230a6b More refactors and bugfixes 2017-06-11 00:38:12 -07:00
codelucas
f6738ec6ab Add logic to handle http != 2XX response failures. Inform user download failed 2017-06-10 02:47:57 -07:00
Lucas Ou-Yang
87d69e7c74 Make the README installation even clearer 2017-06-07 00:14:19 -07:00
Lucas Ou-Yang
e13b0e616e Clarify install script in README
A ton of people are having issues because the library name in python3 is newspaper3k, NOT newspaper. We should do something else to clarify this
2017-06-07 00:07:20 -07:00
codelucas
e54ee8a9db Bump version 0.1.9 => 0.2.0
New features and bugfixes such as:
- Python 3.3 deprecation and forward compatibility
- Feedparser upgrades
- Readme clarifications
- New stopwords for Greek
2017-06-04 02:46:27 -07:00
Lucas Ou-Yang
5e51ee2bf5 Merge pull request #356 from 12DReflections/patch-1
NLTK Corpus dependency install before newspaper3k
2017-06-04 02:39:20 -07:00
Lucas Ou-Yang
af99081634 Merge pull request #269 from Factr/master
Re-implemented parse_feeds() and removed Python 3.3
2017-06-04 02:34:22 -07:00
Lucas Ou-Yang
c474a89bbb Merge pull request #373 from tseste/master
Added Greek language support
2017-06-02 20:29:08 -07:00
Stylianos Tsesmetzis
665263d528 Update utils.py 2017-05-29 12:15:58 +03:00
Stylianos Tsesmetzis
af6b51a459 Update README.rst 2017-05-29 12:15:49 +03:00
Andreas Mavridis
221ccdf354 Added stopwords-el.txt 2017-05-29 12:15:24 +03:00
Lucas Ou-Yang
9e2571a99c Merge pull request #353 from yprez/fix-manifest-in
Exclude .pyc files from packaging
2017-05-16 14:18:01 -07:00
Julian Wise
128022dd99 NLTK Corpus dependency install before newspaper3k
The corpora is required for the install of newspaper3k or else it errors.
I tried pip installing, cloning & downloading older versions of newspaper3k before realising.

While the change seems minor it may prevent users from being deterred because they couldn't get the library working.
2017-03-30 16:13:20 +11:00
Yuri Prezument
49ad514db8 Exclude .pyc files from packaging
Ref #350
2017-03-28 14:20:35 +03:00
Adam Nelson
c56429322c Handle recursion error in Article.download when meta_refresh_url=True 2017-03-07 11:00:37 -05:00
Marmaduke Woodman
85ea81f27e handle medium user urls 2017-02-22 13:55:22 +01:00
Adam Nelson
b187c3659b Merge remote-tracking branch 'upstream/master' 2017-02-16 16:15:53 -05:00
Lucas Ou-Yang
8c87acef46 Version bump to 0.1.9 ==> Push to pypi (!hotfixing python 3.6 import error!) 2017-01-04 02:24:47 -08:00
Lucas Ou-Yang
f4724fa914 Merge pull request #313 from valaparthvi/master
Changes in Readme file
2017-01-04 02:05:30 -08:00
Romain Dorgueil
132db69434 Is newspaper3k running on python 3.6 ? (#315)
* Update .travis.yml

* Remove unused class constant "PUNCTUATION" which throws an error on python 3.6, because it contains invalid escape sequences.
2017-01-04 02:03:14 -08:00
Parthvi Vala
9963ac9252 Changes in Readme file 2016-12-28 20:22:26 +05:30
Adam Nelson
12f42ae993 Info loglevel for lxml node error 2016-12-12 15:08:29 -05:00
Adam Nelson
618d38d611 Merge remote-tracking branch 'upstream/master' 2016-12-05 11:06:46 -05:00
Lucas Ou-Yang
90d379b4fc Merge pull request #296 from Factr/fix-relative-canonical-link
Fix relative canonical link (See #228)
2016-12-04 21:12:59 -08:00
Lucas Ou-Yang
97ba0f1d2f Merge pull request #305 from Referly/alex/set_default_scheme
use http as default protocol
2016-12-04 21:04:20 -08:00
Alex Lee
439933893f use the var 2016-11-21 16:22:54 -08:00
Alex Lee
8f11a051ea use http as default protocol 2016-11-21 16:21:30 -08:00
Adam Nelson
cf9bbccb44 No longer logging exception with lxml parse warning 2016-10-18 14:51:03 -04:00
Adam Nelson
599ad12878 get_canonical_link handle hostname 2016-10-14 14:04:52 -04:00
Yuri Prezument
e96c56fa21 Merge pull request #288 from adamchainz/readthedocs.io
Convert readthedocs links for their .org -> .io migration for hosted projects
2016-10-14 18:30:44 +03:00
Adam Nelson
27ebe6d97f Addressed lxml warning with bool on doc
Check out http://stackoverflow.com/questions/18583162/difference-between-if-obj-and-if-obj-is-not-none
2016-10-10 11:26:45 -04:00
Adam Chainz
563fcd9e12 Convert readthedocs links for their .org -> .io migration for hosted projects
As per [their blog post of the 27th April](https://blog.readthedocs.com/securing-subdomains/) ‘Securing subdomains’:

> Starting today, Read the Docs will start hosting projects from subdomains on the domain readthedocs.io, instead of on readthedocs.org. This change addresses some security concerns around site cookies while hosting user generated data on the same domain as our dashboard.

Test Plan: Manually visited all the links I’ve modified.
2016-09-22 22:06:28 +01:00
Adam Nelson
823aa2b007 Return None when parsing a date_str of None 2016-07-26 15:14:07 -04:00
Adam Nelson
ea98516a40 Added /feed to common_feed_urls 2016-07-25 17:25:11 -04:00
Adam Nelson
6ef2a23ef3 Better warning logging and handle invalid rss doc better 2016-07-18 17:11:15 -04:00
Adam Nelson
0614f3fb73 Handle no response scenario in Source(url=bad_url) 2016-07-12 17:00:39 -04:00
Adam Nelson
601d3a4a2d Look at common urls like /feeds and /rss to find feeds 2016-07-12 16:02:07 -04:00
Adam Nelson
8c514dad59 Merge remote-tracking branch 'upstream/master' 2016-07-11 15:31:40 -04:00
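
Commit 27ebe6d97f above ("Addressed lxml warning with bool on doc") points at the difference between `if obj` and `if obj is not None`. The sketch below illustrates that difference in general terms and is not the code from that commit. The `Node` class is a hypothetical stand-in for a parsed element whose truth value depends on its number of children; no lxml API is used or assumed.

```python
class Node:
    """Hypothetical stand-in for a parsed element (not an lxml class)."""

    def __init__(self, children=None):
        self.children = list(children or [])

    def __len__(self):
        # Python falls back to __len__ for truth testing when __bool__ is absent,
        # so a Node with no children evaluates as False.
        return len(self.children)


def describe_truthy(doc):
    # Truthiness check: an existing but childless node is reported as missing.
    return "present" if doc else "missing"


def describe_explicit(doc):
    # Identity check: only a real None is reported as missing.
    return "present" if doc is not None else "missing"


empty_doc = Node()
print(describe_truthy(empty_doc))    # childless node fails the truthiness test
print(describe_explicit(empty_doc))  # childless node still counts as present
print(describe_explicit(None))       # only None is treated as absent
```

Under these assumptions, the explicit `is not None` comparison is the safer choice whenever "no document" and "empty document" need to be told apart.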