
Stderr logging - cleanup only #136

Merged
merged 3 commits into from
Apr 6, 2024


MatthewRalston
Owner

This does not close #132 entirely. The new interface is also not addressed. #133 not addressed.

…h PyPI/pip and conda using both pyproject.toml (pip install .) and python setup.py install
…l release. Doing the appmap schemas in parallel. Addressing issue #132, issue #134, issue #135.
Looking towards enhancements of the UI, because clear communication is clear communication.

EDITMSG:
Removes deprecated parameters in graph/parse and init (issue #135). Working on stderr logging and interface revisions (issue #132) to the Anvar assignment.

-------

We have finally added a graph layer to the original parameter reduction space, distance matrix, and visualization, through clustering graphics that accompany the kmeans and hierarchical commands.

This is an unsquashed commit, true. I just need the workspace clear for my own head. I am struggling personally, so output is relatively simple right now.

Note: sorry, I think I'll need a lot of metadata on the key relations in the assembly 'story'. There will be backtracking and weight optimizations, local maxima, and other problems galore. How? Well... essentially, part of it is the

I'd like to understand the extent to which sequences are naturally graphical, and where I want the deBruijn graph algorithm to make sensible decisions, for the sake of brevity in the solution, and then compare with other algorithms in an honest way. 'kmerdb assemble .kdb .kdbg' is likely the next step, but I am wondering about a GPU implementation or networkx shortcuts, or maybe just a Cython implementation of breadth-first search? I think that would show the most, but who is my audience, honestly? The manager trying to assure my chops are good? Or am I trying to solve some problems via software? AFAICT, assembling 'genomes' *is* one thing: contigs, merging, bleh/voilà, linear assembly. But there's no guarantee that the contigs are correct unless you're using specialized sequencing (i.e. mixed NGS and long-read sequencing, high depth or single cell). Anyway, if I had to guess, sequence compression ratios still make indexable on-disk databases an attractive option.

Option 1. [ x ] matt gets to implement breadth-first search in Cython and then potentially through a CUDA kernel.
Option 2. [ x ] matt takes networkx/cugraph shortcuts to reach a viable assembler sooner.
Option 3. [ x ] matt's always putting off the bioRxiv draft, and that's because of personal shame. I've... never written a grant or that many project applications with best effort and confidence, because the business side of science is not even an option. If you want to be good and have skill in the sciences, it's as much in the lab as at a desk. You of course strive for your conclusions to have weight, hold against experimental control angle and cubic designs, and you also want to have the question drive *you*. To be in the field and thriving, letting the curiosity drive the quality of the proliferation etc. and divorce that from coin. If I were in school I'd talk about two public-good projects, one or two businesses, and several ideas for maybe some ascii-to-midi projects I'd like to do on my old iPod library.

So yeah, then there's my day job. Anyways.

But then there's this LLC thing. Private business for running RNA-seq, Stripe-mediated processing... uh, showmanship? lol... what do you even do to elaborate on offerings for some selection of pipelines, other than licensing a *scalable* solution and just working hard on the pipelines until the desired client base is covered, then go sample in the great outdoors, grind the day job (maybe a lab position by then), and wait for the client base to grow.

The timeframe for that really depends on log aggregation and a few other technical barriers. I really appreciate the minikube team, and I've just had some 'not the right time' barriers to deploying a functional RNA-seq pipeline in the timeframe I want using the available cloud vendor. So I decided to let the adjacent toolchains (Portainer, Proxmox, TrueNAS SCALE, and other technologies) develop, and to deploy on my workstation. The focus is on the core features before testing migration.

The other issue is how much to parameterize, and how to condense and user-ify the suggestions from the literature and their relevance to heuristics, parameter ranges, etc. This applies both to the kmerdb assemble command (which isn't implemented yet as of 3/5/24) and to the future offerings in the data processing dept.

Sure, I'm interested in sequencing by synthesis and pore-mediated 3rd gen sequencing. If NGS is second gen, then by now we have 3rd gen long-read metagenomics and metatranscriptomics, as well as traditional 2nd gen NGS, as complementary methods for what sequencing offers researchers... I'd like to make an offering at some point for this group. Lots to do first, though.

The contrasts between deep, shallow, and mixed sequencing methods intrigue me in a similar way today, and my interest stems from k-mer origin and depth at key areas within existing assemblies/contigs and from integrating additional/available datasets.

(Identifying and describing contigs requires knowledge of deBruijn graphs. How you solve a graph depends on the 'cases' or conditions used to compare and discriminate between potential paths, and on a consistent application of those criteria to establish outcomes.)
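
To make the 'cases' idea concrete, here is a minimal sketch in plain Python. It assumes the graph's nodes are k-mers and that an edge joins each k-mer to the next one along the sequence (a k-1 overlap). The function names `debruijn_edges` and `branching_nodes` are hypothetical illustrations, not part of the kmerdb API. Branching nodes are the places where a path-selection criterion would have to be applied.

```python
from collections import defaultdict


def debruijn_edges(seq, k):
    """Return (k-mer, next k-mer) pairs for consecutive k-mers in seq.

    Hypothetical helper for illustration only; consecutive k-mers
    overlap by k-1 characters.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return list(zip(kmers, kmers[1:]))


def branching_nodes(edges):
    """Map each k-mer with more than one distinct successor to its successors."""
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)
    return {u: sorted(vs) for u, vs in successors.items() if len(vs) > 1}


if __name__ == "__main__":
    edges = debruijn_edges("ACGTACGA", 3)
    # "ACG" occurs twice and is followed by a different k-mer each time,
    # so it is a decision point for any traversal.
    print(branching_nodes(edges))
```

Any k-mer that appears here has at least two outgoing paths, so an assembler needs some rule (weights, depth, or another criterion) to choose between them consistently.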

Single cell sequencing is a well-established technology for looking at distinct gene signature series, especially for expression profiling. That said, particular questions regarding fidelity and statistical methodologies require a bit of normal skepticism, a variety of tools in your problem-solving approach, the right dataset that tells the right story (for example, perhaps a simulated dataset involving rare variants that is self-explanatory by assessment), and a healthy diet of stat. transcriptomics blogs. Of course... I only mention it as a potential

Back to kmerdb,
The problem now involves the edge-list traversal approach, which components might be delegated, and to what extent new code/tooling is needed.
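
As a rough reference point for the edge-list traversal, below is a pure-Python breadth-first search over a list of (k-mer, k-mer) pairs, using only the standard library. It is a sketch under assumptions: the name `bfs_order` and the tuple-based edge-list format are hypothetical, not the .kdbg format or any kmerdb function. A Cython, networkx, or GPU version would be a delegated replacement for this same loop.

```python
from collections import defaultdict, deque


def bfs_order(edges, start):
    """Return nodes reachable from start, in breadth-first visiting order.

    edges is an iterable of (source, target) pairs (assumed format).
    """
    # Build an adjacency list once so each node's neighbors are a direct lookup.
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    example_edges = [
        ("ACG", "CGT"),
        ("ACG", "CGA"),
        ("CGT", "GTA"),
        ("CGA", "GAC"),
        ("GTA", "TAC"),
    ]
    print(bfs_order(example_edges, "ACG"))
```

The adjacency-list build and the queue loop are the two pieces that could be handed to another library independently, which is one way to frame the question of what to delegate.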

If the future of genomes and gene knowledge is the non-linear genome, both in terms of linkages and physical structure, then the future of sequencing data should also be graphical.

Please address complaints on consistency to your nearest customer service representative, as my patience is thin.

So if i gotta talk about it...

well, where do we go from here?

half the country screaming bout country needs jesus but oh wait we're insecure white milquetoast middle (working class) class americuns. we like the thug and financially illiterate business that Trump iconizes, some ideal of the business vc darling child, instead of the real health worker, the real epidemiologists, pathologists, infectious disease experts, those at the front lines of humanity's battle against the biology and survival here... not off in the cosmos like some of those on Wall Street and Hollywood would have us dream.

the dream was here.

and trump put us here.

MatthewRalston added the bug, documentation, wontfix, and dependencies labels Apr 6, 2024
MatthewRalston added this to the Interface Revision milestone Apr 6, 2024
MatthewRalston self-assigned this Apr 6, 2024
MatthewRalston merged commit c7db249 into master Apr 6, 2024
MatthewRalston deleted the stderr_logging_cleanup branch April 6, 2024 05:39
Labels
bug, dependencies, documentation, wontfix
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

stderr logging and interface cleanup
1 participant