Commit graph

350 commits

Author SHA1 Message Date
Lucas Ou-Yang
b9dac8d1fb Revamped all unit tests so that every request is mocked (unit tests will now run for everyone!)
Deleted a bunch of unused stray data files under /tests/data.
2014-10-12 16:08:01 -07:00
Lucas Ou-Yang
4e31fc3124 Update version from 0.0.7 to 0.0.8 2014-10-12 14:13:29 -07:00
Lucas Ou-Yang
6da4fa96e3 Remove unneeded files 2014-10-12 14:10:07 -07:00
Igor Shevchenko
6aa85d835e Add slash splitter 2014-08-15 21:38:42 +06:00
Lucas Ou-Yang
d6d16134d0 [bugfix] Fix wrong equality comparison
Reference this fix in the repo of a library we used to be dependent on: https://github.com/grangier/python-goose/pull/114
2014-08-05 19:33:02 -07:00
Lucas Ou-Yang
3053ea2720 [bugfix] Removed bad reference in setup.py
`setup.py` used to require a package directory named
`data`; this directory was removed in a previous
commit.
2014-08-05 18:56:37 -07:00
Lucas Ou
25daa2b679 Merge pull request #71 from codelucas/huge-refactor-codelucas
Huge refactor: entire codebase in PEP8, imports alphabetized, bugfixes, core changes
2014-08-05 00:12:38 -07:00
Lucas Ou-Yang
81329a138a Merge branch 'huge-refactor-codelucas' of https://github.com/codelucas/newspaper into huge-refactor-codelucas
Conflicts:
	newspaper/extractors.py
2014-08-03 22:22:08 -07:00
Lucas Ou-Yang
9db8a96c0e [refactor] Methods from extractors, formatters, cleaners now take item params, not article or source objects 2014-08-03 22:19:42 -07:00
Lucas Ou-Yang
554ec654fc [refactor] Methods from extractors, formatters, cleaners now take item params, not article or source objects 2014-08-03 22:10:36 -07:00
Lucas Ou-Yang
dd577abdae [refactor] Adhere entire codebase to pep8 2014-08-03 20:12:00 -07:00
Lucas Ou-Yang
b436d38883 [bugfix] Forgot to init logger object in extractors.py 2014-08-03 16:06:51 -07:00
Lucas Ou-Yang
43e6ebbc09 [bugfix] Broken feed extraction, b/c of unescaped non-regex chars 2014-08-03 16:06:22 -07:00
Lucas Ou-Yang
4042badd63 [refactor] Alphabetized imports, structured according to pep8
Also fixed spacing between imports and body code if needed.
2014-08-03 16:00:23 -07:00
Lucas Ou-Yang
be4c9fc99f Merge branch 'master' of https://github.com/codelucas/newspaper 2014-08-03 11:05:48 -07:00
Lucas Ou-Yang
9bbbeaa69b Merge branch 'master' of https://github.com/codelucas/newspaper 2014-08-03 11:02:08 -07:00
Lucas Ou-Yang
75883895b0 Merge branch 'master' of https://github.com/codelucas/newspaper 2014-08-03 10:57:29 -07:00
Lucas Ou-Yang
246ad93eaa [Bugfix] Fallback to 'en' for bogus extracted meta langs
Also patched a bug where the `stopwords_class` was not updated
properly when non-Latin languages were extracted from
meta tags. See the updated unit tests.
2014-08-03 10:57:20 -07:00
Lucas Ou-Yang
0d812d81c7 [Bugfix] Fallback to 'en' for bogus extracted meta langs
Also patched a bug where the `stopwords_class` was not updated
properly when non-Latin languages were extracted from
meta tags. See the updated unit tests.
2014-08-03 10:44:04 -07:00
Lucas Ou-Yang
95299356cc Refactor miscellaneous data files to more sensible location. 2014-08-01 01:07:23 -07:00
Lucas Ou-Yang
6dfcf7820e Increased speed of NLP by 8x via set lookups vs list 2014-08-01 00:49:30 -07:00
Lucas Ou-Yang
c72f408e94 Refactor and add tests for metadata extraction 2014-07-31 22:43:00 -07:00
Lucas Ou
e8e99bc740 Merge pull request #69 from karls/meta-tag-extraction-fixes
Meta tag extraction fixes
2014-07-31 21:39:15 -07:00
Karl Sutt
bdda57137a Fix meta extraction.
Meta tags were incorrectly extracted when the meta key was not in the
form of "foo:bar". The resulting value was, incorrectly, an empty dict.
2014-07-31 21:37:20 -07:00
Karl Sutt
ea824f89d2 Merge branch 'test-suite-improvements' into meta-tag-extraction-fixes 2014-07-31 10:02:48 +01:00
Lucas Ou
0066978337 Merge pull request #68 from karls/test-suite-improvements
Test suite improvements
2014-07-30 16:38:05 -07:00
Karl Sutt
1154dc4136 Update requirements 2014-07-30 19:54:34 +01:00
Karl Sutt
826b912aee Uncomment a test 2014-07-29 22:00:20 +01:00
Lucas Ou
676b10a8bf Merge pull request #67 from karls/test-suite-fixes
Test suite fixes
2014-07-29 13:53:32 -07:00
Karl Sutt
02dc291b2e Fix meta extraction.
Meta tags were incorrectly extracted when the meta key was not in the
form of "foo:bar". The resulting value was, incorrectly, an empty dict.
2014-07-28 16:25:51 +01:00
Karl Sutt
02383cdd6f Mock HTTP requests
Mock out any real HTTP requests made from within tests. Responses are
now deterministic, as the body of the response is loaded from an HTML file.
2014-07-28 13:55:57 +01:00
Lucas Ou
85a912d62a Merge pull request #66 from codelucas/revert-63-master
Revert "Added published date to the extractor+article"
2014-07-28 00:23:03 -07:00
Lucas Ou
59b3cbea9d Revert "Added published date to the extractor+article" 2014-07-28 00:22:42 -07:00
Lucas Ou
9179db51b0 Merge pull request #63 from skinnyp/master
Added published date to the extractor+article
2014-07-28 00:22:11 -07:00
Lucas Ou-Yang
6ab9ee5ec6 [bugfix] Add language display key for Norwegian (Bokmål) 2014-07-27 23:18:14 -07:00
Karl Sutt
6dc7b174b4 Lock package versions in requirements.txt 2014-07-24 10:55:43 +01:00
Karl Sutt
0d02da4d50 ✂️ whitespace EOF 2014-07-23 17:27:15 +01:00
Karl Sutt
1973a9bd53 CNN's description has changed. Fix test accordingly 2014-07-23 17:26:46 +01:00
Karl Sutt
bf7437b27a Download and parse article before doing any NLP 2014-07-23 17:26:09 +01:00
Karl Sutt
f5e6f19ff8 Download before parsing. Fix number of images assertion 2014-07-23 17:25:16 +01:00
Karl Sutt
634f0990de Fix threading tests -- config passed incorrectly 2014-07-23 17:24:10 +01:00
Karl Sutt
94d51bdf08 Use build() instead of download() and parse() 2014-07-23 17:23:44 +01:00
Karl Sutt
223447670c Move setting up the test to setUp() (otherwise fails) 2014-07-23 17:21:35 +01:00
Karl Sutt
2f201adbc7 Meta tag extraction fix (need to build article first) 2014-07-23 17:19:59 +01:00
Karl Sutt
69b00c8554 Fix a multilingual test failure (due to newline EOF) 2014-07-23 17:18:15 +01:00
Karl Sutt
6ef0d28445 Remove stray newlines at the end of files 2014-07-23 17:17:19 +01:00
Karl Sutt
20796fcc6a Make tests/ directory a package. 2014-07-23 17:16:02 +01:00
Parham Saidi
827b8a014e added published date to the extractor+article 2014-07-16 16:42:53 +00:00
Parham Saidi
4d3afeb37b added vim swap files 2014-07-16 15:59:16 +00:00
Lucas Ou-Yang
6a0a365694 fixed formatting bug in docs, this is terrible 2014-06-17 03:34:54 -07:00
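
Note on commit 6dfcf7820e ("Increased speed of NLP by 8x via set lookups vs list"): the log does not show the change itself, so the following is only a minimal sketch of the general technique the message names, not the project's actual code. The stopword data and the `count_stopwords` helper are hypothetical and exist purely for illustration.

```python
# Hypothetical stopword data; not taken from the newspaper codebase.
STOPWORDS_LIST = ["the", "a", "an", "of", "and", "to", "in"]
STOPWORDS_SET = frozenset(STOPWORDS_LIST)


def count_stopwords(tokens, stopwords):
    """Count how many tokens appear in the given stopword collection."""
    return sum(1 for token in tokens if token in stopwords)


tokens = "the quick brown fox jumps over a lazy dog in the park".split()

# Both containers give the same count; only the lookup cost differs.
# A list is scanned element by element (O(n) per lookup), while a set
# uses hashing (O(1) on average per lookup).
list_count = count_stopwords(tokens, STOPWORDS_LIST)
set_count = count_stopwords(tokens, STOPWORDS_SET)
```

The size of any speedup depends on how many stopwords there are and how often membership is tested, so the 8x figure in the commit message should be read as specific to that codebase.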