Lucas Ou-Yang
b9dac8d1fb
Revamped all unit tests so that every request is mocked (unit tests will now run for everyone!)
...
Deleted a bunch of unused stray data files under /tests/data.
2014-10-12 16:08:01 -07:00
Lucas Ou-Yang
4e31fc3124
Update version from 0.0.7 to 0.0.8
2014-10-12 14:13:29 -07:00
Lucas Ou-Yang
6da4fa96e3
Remove unneeded files
2014-10-12 14:10:07 -07:00
Igor Shevchenko
6aa85d835e
Add slash splitter
2014-08-15 21:38:42 +06:00
Lucas Ou-Yang
d6d16134d0
[bugfix] Fix wrong equality comparison
...
Reference this fix in the repo of a library we used to be dependent on: https://github.com/grangier/python-goose/pull/114
2014-08-05 19:33:02 -07:00
Lucas Ou-Yang
3053ea2720
[bugfix] Removed bad reference in setup.py
...
`setup.py` used to require a package directory named
`data`, but that directory was removed in a previous
commit.
2014-08-05 18:56:37 -07:00
Lucas Ou
25daa2b679
Merge pull request #71 from codelucas/huge-refactor-codelucas
...
Huge refactor: entire codebase in PEP8, imports alphabetized, bugfixes, core changes
2014-08-05 00:12:38 -07:00
Lucas Ou-Yang
81329a138a
Merge branch 'huge-refactor-codelucas' of https://github.com/codelucas/newspaper into huge-refactor-codelucas
...
Conflicts:
newspaper/extractors.py
2014-08-03 22:22:08 -07:00
Lucas Ou-Yang
9db8a96c0e
[refactor] Methods from extractors, formatters, cleaners now take item params, not article or source objects
2014-08-03 22:19:42 -07:00
Lucas Ou-Yang
554ec654fc
[refactor] Methods from extractors, formatters, cleaners now take item params, not article or source objects
2014-08-03 22:10:36 -07:00
Lucas Ou-Yang
dd577abdae
[refactor] Adhere entire codebase to pep8
2014-08-03 20:12:00 -07:00
Lucas Ou-Yang
b436d38883
[bugfix] Forgot to init logger object in extractors.py
2014-08-03 16:06:51 -07:00
Lucas Ou-Yang
43e6ebbc09
[bugfix] Broken feed extraction, b/c of unescaped non-regex chars
2014-08-03 16:06:22 -07:00
Lucas Ou-Yang
4042badd63
[refactor] Alphabetized imports, structured according to pep8
...
Also fixed spacing between imports and body code if needed.
2014-08-03 16:00:23 -07:00
Lucas Ou-Yang
be4c9fc99f
Merge branch 'master' of https://github.com/codelucas/newspaper
2014-08-03 11:05:48 -07:00
Lucas Ou-Yang
9bbbeaa69b
Merge branch 'master' of https://github.com/codelucas/newspaper
2014-08-03 11:02:08 -07:00
Lucas Ou-Yang
75883895b0
Merge branch 'master' of https://github.com/codelucas/newspaper
2014-08-03 10:57:29 -07:00
Lucas Ou-Yang
246ad93eaa
[Bugfix] Fallback to 'en' for bogus extracted meta langs
...
Also patched a bug where the `stopwords_class` was not updated
properly when non-Latin languages were extracted from
meta tags. See the updated unit tests.
2014-08-03 10:57:20 -07:00
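The fallback described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `SUPPORTED_LANGUAGES` and `normalize_meta_lang` are hypothetical names, and the language set is an arbitrary subset chosen for the example.

```python
# Hypothetical subset of language codes; the real project supports its own list.
SUPPORTED_LANGUAGES = {"en", "es", "zh", "ar", "ru"}


def normalize_meta_lang(raw, default="en"):
    """Return a two-letter language code, or `default` for bogus input."""
    if not raw:
        # Missing or empty meta value: nothing to validate.
        return default
    # Keep only the primary subtag, e.g. "zh-CN" -> "zh".
    code = raw.strip().lower()[:2]
    return code if code in SUPPORTED_LANGUAGES else default


print(normalize_meta_lang("zh-CN"))
print(normalize_meta_lang("xx"))
```

Validating against a known set before choosing language-specific resources is one way to make sure anything keyed on the language (such as a stopwords class) is only selected for codes that are actually supported.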
Lucas Ou-Yang
0d812d81c7
[Bugfix] Fallback to 'en' for bogus extracted meta langs
...
Also patched a bug where the `stopwords_class` was not updated
properly when non-Latin languages were extracted from
meta tags. See the updated unit tests.
2014-08-03 10:44:04 -07:00
Lucas Ou-Yang
95299356cc
Refactor miscellaneous data files to more sensible location.
2014-08-01 01:07:23 -07:00
Lucas Ou-Yang
6dfcf7820e
Increased speed of NLP by 8x via set lookups vs list
2014-08-01 00:49:30 -07:00
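The idea behind the commit above (membership tests against a set instead of a list) can be illustrated with a small sketch. The data and the `count_stopwords` helper are hypothetical, and the measured ratio will vary by machine and input; the 8x figure is the commit author's, not something this snippet reproduces.

```python
import timeit

# Hypothetical stopword collection; not the project's actual data.
words = [f"word{i}" for i in range(2000)]
stopwords_list = words
stopwords_set = set(words)


def count_stopwords(tokens, stopwords):
    """Count tokens that appear in the stopword collection."""
    return sum(1 for token in tokens if token in stopwords)


tokens = [f"word{i}" for i in range(0, 4000, 7)]

# Both containers give the same answer; only the lookup cost differs:
# `in` scans a list element by element, but hashes directly into a set.
assert count_stopwords(tokens, stopwords_list) == count_stopwords(tokens, stopwords_set)

list_time = timeit.timeit(lambda: count_stopwords(tokens, stopwords_list), number=3)
set_time = timeit.timeit(lambda: count_stopwords(tokens, stopwords_set), number=3)
print(f"list: {list_time:.4f}s, set: {set_time:.4f}s")
```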
Lucas Ou-Yang
c72f408e94
Refactored and added tests for metadata extraction
2014-07-31 22:43:00 -07:00
Lucas Ou
e8e99bc740
Merge pull request #69 from karls/meta-tag-extraction-fixes
...
Meta tag extraction fixes
2014-07-31 21:39:15 -07:00
Karl Sutt
bdda57137a
Fix meta extraction.
...
Meta tags were incorrectly extracted when the meta key was not in the
form of "foo:bar". The resulting value was, incorrectly, an empty dict.
2014-07-31 21:37:20 -07:00
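A rough sketch of the behavior this fix describes: keys of the form "foo:bar" nest into dicts, while plain keys keep their value instead of becoming an empty dict. `build_meta` is a hypothetical helper written for illustration and is not the project's extractor code.

```python
def build_meta(pairs):
    """Build a nested dict from (key, value) meta tag pairs."""
    meta = {}
    for key, value in pairs:
        parts = key.split(":")
        node = meta
        # Walk (and create) one dict level per prefix, e.g. "og" in "og:title".
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                # A plain value stored earlier under this name is replaced.
                child = {}
                node[part] = child
            node = child
        # A key without ":" has no prefixes, so the value lands at the top level.
        node[parts[-1]] = value
    return meta


pairs = [
    ("description", "News site"),
    ("og:title", "Headline"),
    ("og:image:width", "640"),
]
print(build_meta(pairs))
```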
Karl Sutt
ea824f89d2
Merge branch 'test-suite-improvements' into meta-tag-extraction-fixes
2014-07-31 10:02:48 +01:00
Lucas Ou
0066978337
Merge pull request #68 from karls/test-suite-improvements
...
Test suite improvements
2014-07-30 16:38:05 -07:00
Karl Sutt
1154dc4136
Update requirements
2014-07-30 19:54:34 +01:00
Karl Sutt
826b912aee
Uncomment a test
2014-07-29 22:00:20 +01:00
Lucas Ou
676b10a8bf
Merge pull request #67 from karls/test-suite-fixes
...
Test suite fixes
2014-07-29 13:53:32 -07:00
Karl Sutt
02dc291b2e
Fix meta extraction.
...
Meta tags were incorrectly extracted when the meta key was not in the
form of "foo:bar". The resulting value was, incorrectly, an empty dict.
2014-07-28 16:25:51 +01:00
Karl Sutt
02383cdd6f
Mock HTTP requests
...
Mock out any real HTTP requests made from within tests. Responses are
now deterministic, as the body of the response is loaded from an HTML file.
2014-07-28 13:55:57 +01:00
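The approach in the commit above can be sketched with the standard library's `unittest.mock`. The commit does not say which mocking tool was used, so this is an assumption; `fetch_html` and `CANNED_HTML` are hypothetical, and the canned body is an inline string here rather than a file so the example stays self-contained.

```python
import unittest
import urllib.request
from unittest import mock

# Stand-in for the contents of a saved HTML fixture file.
CANNED_HTML = "<html><head><title>Canned</title></head></html>"


def fetch_html(url):
    """Hypothetical code under test: download a page and return its text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


class FetchTest(unittest.TestCase):
    def test_fetch_uses_canned_body(self):
        # Fake response object that works as a context manager.
        fake_response = mock.MagicMock()
        fake_response.read.return_value = CANNED_HTML.encode("utf-8")
        fake_response.__enter__.return_value = fake_response

        # No real network request is made while the patch is active.
        with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_open:
            html = fetch_html("http://example.com/article")

        self.assertEqual(html, CANNED_HTML)
        fake_open.assert_called_once_with("http://example.com/article")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FetchTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the response body is fixed, the assertion cannot break when the live site changes its markup.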
Lucas Ou
85a912d62a
Merge pull request #66 from codelucas/revert-63-master
...
Revert "Added published date to the extractor+article"
2014-07-28 00:23:03 -07:00
Lucas Ou
59b3cbea9d
Revert "Added published date to the extractor+article"
2014-07-28 00:22:42 -07:00
Lucas Ou
9179db51b0
Merge pull request #63 from skinnyp/master
...
Added published date to the extractor+article
2014-07-28 00:22:11 -07:00
Lucas Ou-Yang
6ab9ee5ec6
[bugfix] Add language display key for Norwegian (Bokmål)
2014-07-27 23:18:14 -07:00
Karl Sutt
6dc7b174b4
Lock package versions in requirements.txt
2014-07-24 10:55:43 +01:00
Karl Sutt
0d02da4d50
✂️ whitespace EOF
2014-07-23 17:27:15 +01:00
Karl Sutt
1973a9bd53
CNN's description has changed. Fix test accordingly
2014-07-23 17:26:46 +01:00
Karl Sutt
bf7437b27a
Download and parse article before doing any NLP
2014-07-23 17:26:09 +01:00
Karl Sutt
f5e6f19ff8
Download before parsing. Fix number of images assertion
2014-07-23 17:25:16 +01:00
Karl Sutt
634f0990de
Fix threading tests -- config passed incorrectly
2014-07-23 17:24:10 +01:00
Karl Sutt
94d51bdf08
Use build() instead of download() and parse()
2014-07-23 17:23:44 +01:00
Karl Sutt
223447670c
Move setting up the test to setUp() (otherwise fails)
2014-07-23 17:21:35 +01:00
Karl Sutt
2f201adbc7
Meta tag extraction fix (need to build article first)
2014-07-23 17:19:59 +01:00
Karl Sutt
69b00c8554
Fix a multilingual test failure (due to newline EOF)
2014-07-23 17:18:15 +01:00
Karl Sutt
6ef0d28445
Remove stray newlines at the end of files
2014-07-23 17:17:19 +01:00
Karl Sutt
20796fcc6a
Make tests/ directory a package.
2014-07-23 17:16:02 +01:00
Parham Saidi
827b8a014e
added published date to the extractor+article
2014-07-16 16:42:53 +00:00
Parham Saidi
4d3afeb37b
added vim swap files
2014-07-16 15:59:16 +00:00
Lucas Ou-Yang
6a0a365694
fixed formatting bug in docs, this is terrible
2014-06-17 03:34:54 -07:00