Skip to content

Added code to fetch based on hydrated account status#71

Merged
webcoderz merged 1 commit intomasterfrom
issue-70
Aug 18, 2020
Merged

Added code to fetch based on hydrated account status#71
webcoderz merged 1 commit intomasterfrom
issue-70

Conversation

@bechbd
Copy link
Copy Markdown
Collaborator

@bechbd bechbd commented Apr 27, 2020

No description provided.

@bechbd bechbd requested review from bmorphism and lmeyerov April 27, 2020 02:23
from typing import Optional
from functools import partial
import logging
logger = logging.getLogger('ds')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocker, for next time gardening: python world seems to be moving to tighter forms like (a) full-module imports all on one line (alphabetical) and (b) explicit imports per-line

self.url_drugbank: str = os.environ["URL_DRUGBANK"] if isinstance(
os.environ["URL_DRUGBANK"], str) else None
self.query_keywords: [] = os.environ["QUERY_KEYWORDS"].split(
",") if isinstance(os.environ["QUERY_KEYWORDS"], str) else None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was a little surprised to see configfile / envvar stuff here vs. regular parameters (ex: testability), but ok!

return response.json()

def api_wrapper(self, query, from_study):
return self.api(query, from_study, from_study+99, self.url_USA)
Copy link
Copy Markdown
Contributor

@lmeyerov lmeyerov May 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+99 some sort of interval len - make an optional param ..., span=99)?

return self.api(query, from_study, from_study+99, self.url_USA)

def getAllStudiesByQuery(self, query: str) -> list:
logger.info("> STARTING scraping with '{}' keyword".format(query))
Copy link
Copy Markdown
Contributor

@lmeyerov lmeyerov May 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using logger, instead of eagerly interpolating, better to use %s, so interpolation only executes when in that log level:

logger.info("zzzz %s", query)
def api(query, from_study, to_study, url):
url = url.format(query, from_study, to_study)
response = requests.request("GET", url)
return response.json()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error handler?

"> {} studies found by '{}' keyword".format(nstudies, query))
if nstudies > 0:
studies = temp['FullStudiesResponse']['FullStudies']
for study_index in range(from_study+100, nstudies, 100):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, good to have stride as a parameter w/ default value, ..., span=100)

return studies

@staticmethod
def xls_handler(r):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this downloads an xls, rewrites into csv, and rereads the csv w/ pandas, and returns the result

Pandas has an xls reader, and may be able to work directly on the bytes buffer. A bit surprising to see like this, and I recall discussion of some formatting issues encountered along the way.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not urgent if works, to be clear)

session.run(traversal, users=user_data)
cls.ids = pd.DataFrame({'id': [1, 2, 3, 4, 5]})
except Exception as err:
print(err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check the error handling... e.g., should we detect & rerun if error?

Copy link
Copy Markdown
Contributor

@lmeyerov lmeyerov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine-ish to land:

-- Now: I'd take a think on exception handling, esp. as this gets plugging into automation

-- Next time: Some stylistic stuff to keep in mind, see comments

@webcoderz webcoderz merged commit b28dd2d into master Aug 18, 2020
@webcoderz webcoderz deleted the issue-70 branch August 18, 2020 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants