# Quick Start Guide

## Prerequisites

- Python 3.8 or higher
- Docker Desktop
- 8GB RAM minimum (16GB recommended)
- Windows, macOS, or Linux

## Installation

### Windows

```powershell
# Run in PowerShell as Administrator
.\setup.ps1
```

### Linux/Mac

```bash
chmod +x setup.sh
./setup.sh
```

## Manual Installation

1. **Create virtual environment**
   ```bash
   python -m venv venv

   # Windows
   venv\Scripts\activate

   # Linux/Mac
   source venv/bin/activate
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Start Neo4j**
   ```bash
   docker-compose up -d
   ```

4. **Run application**
   ```bash
   python run.py
   ```

## First Time Usage

1. Open your browser to http://localhost:5000
2. The database will auto-initialize with sample data
3. Explore the dashboard tabs:
   - **Dashboard**: Overview statistics
   - **Neo4j Visualization**: Interactive graph
   - **BOINC Tasks**: Distributed computing
   - **GDC Data**: Cancer genomics data
   - **Analysis Pipeline**: Bioinformatics tools

## GraphQL Queries

Access the GraphQL playground at: http://localhost:5000/graphql

Example queries:

```graphql
# Get all genes
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}

# Get mutations for a gene
query {
  mutations(gene: "TP53") {
    chromosome
    position
    consequence
  }
}

# Get patients with cancer type
query {
  patients(project_id: "TCGA-BRCA") {
    patient_id
    age
    gender
  }
}
```

## API Examples

### Submit BOINC Task

```bash
curl -X POST http://localhost:5000/api/boinc/submit \
  -H "Content-Type: application/json" \
  -d '{"workunit_type": "variant_calling", "input_file": "sample.fastq"}'
```

### Get Database Summary

```bash
curl http://localhost:5000/api/neo4j/summary
```

### Search GDC Files

```bash
# Quote the URL so the shell does not try to expand the "?"
curl "http://localhost:5000/api/gdc/files/TCGA-BRCA?limit=10"
```

## Troubleshooting

### Docker not starting

```bash
# Check Docker status
docker ps

# Restart Docker containers
docker-compose down
docker-compose up -d
```

### Neo4j connection error

1. Wait 30 seconds for Neo4j to fully start
2. Check Neo4j Browser: http://localhost:7474
3. Log in: username=neo4j, password=cancer123

### Python module errors

```bash
# Reinstall dependencies
pip install --upgrade -r requirements.txt
```

## Configuration

Edit `config.yml` to customize:

- Neo4j connection
- GDC API settings
- BOINC configuration
- Pipeline parameters

## Data Sources

### GDC Portal Projects

- TCGA-BRCA: Breast Cancer
- TCGA-LUAD: Lung Adenocarcinoma
- TCGA-COAD: Colon Adenocarcinoma
- TCGA-GBM: Glioblastoma
- TARGET-AML: Acute Myeloid Leukemia

### Sample Data

The system includes sample data for demonstration:

- 7 cancer-associated genes (TP53, BRAF, BRCA1, BRCA2, etc.)
- 5 mutation records
- 5 patient cases
- 4 cancer types

## Development

### Run tests

```bash
pytest
```

### Format code

```bash
black backend/
```

### API Documentation

http://localhost:5000/docs (Swagger UI)

## Support

For issues or questions:

- Check logs: `logs/cancer_at_home.log`
- Review configuration: `config.yml`
- Consult README.md for detailed information
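
When diagnosing the Neo4j connection error described under Troubleshooting, a fixed 30-second wait can be replaced by polling until the Neo4j Browser port answers. The following is a minimal sketch using only the Python standard library; the helper names `http_ok` and `wait_until_ready` are hypothetical and not part of this project, and the default timeout and interval are arbitrary choices.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error status: not ready yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

With the containers started, `wait_until_ready(lambda: http_ok("http://localhost:7474"))` returns `True` once the Neo4j Browser URL from this guide responds, or `False` after the timeout, at which point `docker ps` is the next thing to check.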
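
The example queries in the GraphQL section can also be sent from a script instead of the playground. This sketch assumes the `/graphql` endpoint accepts the common convention of a POST request with a JSON body of the form `{"query": "..."}`; that is not confirmed by this guide, and the function names `build_graphql_request` and `run_query` are hypothetical.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:5000/graphql"  # endpoint listed in this guide

# Same "Get all genes" query as in the GraphQL Queries section.
GENES_QUERY = """
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}
"""


def build_graphql_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build a POST request carrying `query` as a JSON body (assumed convention)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = GRAPHQL_URL, timeout: float = 10.0) -> dict:
    """Send the query and return the decoded JSON response."""
    request = build_graphql_request(query, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, `run_query(GENES_QUERY)` should return the server's JSON reply as a dictionary. GraphQL servers conventionally place results under a top-level `data` key, but the exact response shape here depends on the backend and should be checked against the playground.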