Added code to fetch based on hydrated account status by bechbd · Pull Request #71 · TheDataRideAlongs/ProjectDomino

bechbd · 2020-04-27T02:23:31Z

No description provided.

lmeyerov · 2020-05-03T17:13:23Z

+from typing import Optional
+from functools import partial
+import logging
+logger = logging.getLogger('ds')


Non-blocker, for next-time gardening: the Python world seems to be moving to tighter forms like (a) full-module imports all on one line (alphabetical) and (b) explicit imports one per line

lmeyerov · 2020-05-03T17:15:41Z

+        self.url_drugbank: str = os.environ["URL_DRUGBANK"] if isinstance(
+            os.environ["URL_DRUGBANK"], str) else None
+        self.query_keywords: [] = os.environ["QUERY_KEYWORDS"].split(
+            ",") if isinstance(os.environ["QUERY_KEYWORDS"], str) else None


Was a little surprised to see configfile / envvar stuff here vs. regular parameters (ex: testability), but ok!
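A minimal sketch of the parameter-first alternative, assuming a hypothetical IngestConfig class (the real class in the PR is not shown in this thread) and an injectable env mapping for tests. Explicit arguments win and the environment is only a fallback:

```python
import os
from typing import List, Mapping, Optional


class IngestConfig:
    """Hypothetical config holder; names here are illustrative only."""

    def __init__(
        self,
        url_drugbank: Optional[str] = None,
        query_keywords: Optional[List[str]] = None,
        env: Mapping[str, str] = os.environ,
    ) -> None:
        # Explicit arguments win; otherwise fall back to the environment.
        # .get() returns None for a missing variable instead of raising KeyError.
        if url_drugbank is None:
            url_drugbank = env.get("URL_DRUGBANK")
        if query_keywords is None:
            raw_keywords = env.get("QUERY_KEYWORDS")
            if raw_keywords is not None:
                query_keywords = raw_keywords.split(",")
        self.url_drugbank = url_drugbank
        self.query_keywords = query_keywords
```

A test can then pass env={} or explicit values without touching the process environment.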

lmeyerov · 2020-05-03T17:16:49Z

+        return response.json()
+
+    def api_wrapper(self, query, from_study):
+        return self.api(query, from_study, from_study+99, self.url_USA)


The +99 looks like some sort of interval length. Make it an optional param, e.g. ..., span=99)?

lmeyerov · 2020-05-03T17:17:54Z

+        return self.api(query, from_study, from_study+99, self.url_USA)
+
+    def getAllStudiesByQuery(self, query: str) -> list:
+        logger.info("> STARTING scraping with '{}' keyword".format(query))


When using logger, instead of eagerly interpolating, better to use %s, so interpolation only executes when in that log level:

logger.info("zzzz %s", query)
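To make the difference concrete, here is a small sketch. The Expensive class is hypothetical and only counts how often its string form is built; the logger level is set to WARNING so INFO records are dropped:

```python
import logging

logger = logging.getLogger('ds')
logger.setLevel(logging.WARNING)  # INFO records are dropped at this level


class Expensive:
    """Hypothetical value whose string form is costly to build."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive"


value = Expensive()

# Eager: .format() builds the string before logger.info() is even called.
logger.info("> STARTING scraping with '{}' keyword".format(value))
eager_calls = value.str_calls

# Lazy: the logger checks the level first and never formats the message.
logger.info("> STARTING scraping with '%s' keyword", value)
lazy_calls = value.str_calls - eager_calls
```

With INFO disabled, the eager call still pays for one __str__ while the lazy call pays for none.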

lmeyerov · 2020-05-03T17:19:07Z

+    def api(query, from_study, to_study, url):
+        url = url.format(query, from_study, to_study)
+        response = requests.request("GET", url)
+        return response.json()


error handler?
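One possible shape, as a sketch only: it assumes a session argument that behaves like requests.Session, a hypothetical StudyFetchError type, and an arbitrary 30 second timeout. None of these names come from the PR.

```python
import logging

logger = logging.getLogger('ds')


class StudyFetchError(Exception):
    """Hypothetical error type raised when a study request fails."""


def fetch_studies(session, query, from_study, to_study, url, timeout=30.0):
    """Fetch one window of studies, raising StudyFetchError on any failure.

    `session` is assumed to behave like requests.Session.
    """
    full_url = url.format(query, from_study, to_study)
    try:
        response = session.get(full_url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        return response.json()  # raises if the body is not valid JSON
    except Exception as err:
        # Broad on purpose for the sketch; real code would likely narrow this,
        # e.g. to requests.RequestException and ValueError.
        logger.error("Request failed for %s: %s", full_url, err)
        raise StudyFetchError(full_url) from err
```

Raising a single error type keeps the decision about retrying or skipping with the caller instead of hiding failures inside the fetch.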

lmeyerov · 2020-05-03T17:20:07Z

+            "> {} studies found by '{}' keyword".format(nstudies, query))
+        if nstudies > 0:
+            studies = temp['FullStudiesResponse']['FullStudies']
+            for study_index in range(from_study+100, nstudies, 100):


Again, good to have the stride as a parameter with a default value, e.g. ..., span=100)
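A sketch of how the +99 and the 100 stride could collapse into one span parameter. The study_windows helper is hypothetical, and it assumes 1-based, inclusive study ranks, which the diff does not confirm:

```python
from typing import Iterator, Tuple


def study_windows(first: int, total: int, span: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (from_study, to_study) windows covering first..total.

    span is the window size; the default of 100 mirrors the literal stride in
    the diff, where each request covers from_study..from_study+99.
    """
    if span < 1:
        raise ValueError("span must be >= 1")
    for start in range(first, total + 1, span):
        # Clamp the last window so it never runs past the final study.
        yield start, min(start + span - 1, total)
```

Each yielded pair can be passed straight to the API call, so the window size lives in one place.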

lmeyerov · 2020-05-03T17:23:38Z

+        return studies
+
+    @staticmethod
+    def xls_handler(r):


AFAICT, this downloads an xls, rewrites into csv, and rereads the csv w/ pandas, and returns the result

Pandas has an xls reader, and may be able to work directly on the bytes buffer. A bit surprising to see it done like this, though I recall discussion of some formatting issues encountered along the way.

(not urgent if works, to be clear)
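For reference, a sketch of the direct route. The function name is hypothetical, and it has not been checked against the formatting issues mentioned above, so treat it as a starting point rather than a drop-in replacement:

```python
import io


def xls_bytes_to_frame(content: bytes, **read_excel_kwargs):
    """Parse spreadsheet bytes into a DataFrame without a temporary CSV file.

    pandas is imported lazily so this module loads even where pandas is absent.
    Reading legacy .xls files also requires an Excel engine such as xlrd.
    """
    import pandas as pd

    # read_excel accepts a file-like object, so the bytes never touch disk.
    return pd.read_excel(io.BytesIO(content), **read_excel_kwargs)
```

Assuming r in xls_handler is a requests response, the call would be xls_bytes_to_frame(r.content).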

lmeyerov · 2020-05-03T17:25:08Z

+                session.run(traversal, users=user_data)
+            cls.ids = pd.DataFrame({'id': [1, 2, 3, 4, 5]})
+        except Exception as err:
+            print(err)


Double check the error handling... e.g., should we detect & rerun if error?
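If rerunning is the right call, a small retry helper is one option. This is a sketch with hypothetical names and arbitrary defaults (3 attempts, doubling delay), not something taken from the PR:

```python
import logging
import time

logger = logging.getLogger('ds')


def run_with_retries(task, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds or attempts are exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries and
    re-raises the last error so callers (and automation) see the failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as err:
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, err)
            if attempt == attempts:
                raise  # out of attempts: surface the original exception
            sleep(base_delay * 2 ** (attempt - 1))
```

Unlike print(err), the final failure propagates, so a scheduler can mark the run as failed.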

lmeyerov

Seems fine-ish to land:

-- Now: I'd take a think on exception handling, esp. as this gets plugged into automation

-- Next time: Some stylistic stuff to keep in mind, see comments

Added code to fetch based on hydrated account status

72bf770

bechbd requested review from bmorphism and lmeyerov April 27, 2020 02:23

lmeyerov reviewed May 3, 2020

lmeyerov approved these changes May 3, 2020

webcoderz merged commit b28dd2d into master Aug 18, 2020

webcoderz deleted the issue-70 branch August 18, 2020 20:08
