============================================================
📂 Loading dataset from cache: d:\Github\KokoWorks\DataCamp\Hadaka\Solution\data\dataset_cache.pkl
✓ Loaded from cache
============================================================
📊 PROPORTION DISTRIBUTION ANALYSIS
============================================================
Type        Mean     Std      Min      Max      CV%
----------------------------------------------------
endo        0.1481   0.0821   0.0105   0.4303   55.5%
fibro       0.4357   0.1067   0.1844   0.6887   24.5%
immune      0.1061   0.0828   0.0022   0.5362   78.0%
classic     0.2251   0.1268   0.0000   0.5094   56.3%
basal       0.0850   0.0923   0.0000   0.3427   108.6%

🔍 Problem detection:
   basal: high variability

Correlation between cell types:
   classic <-> basal: -0.669

📊 Combined dataset: 75 samples
============================================================
🚀 Training XGBoost deconvolution model
============================================================
📊 Features: 39,632 (15,908 RNA + 23,724 MET)
📊 Training samples: 60
📊 Cell types: ['endo', 'fibro', 'immune', 'classic', 'basal']
🔧 Training model for: endo
   RMSE (train): 0.0827
🔧 Training model for: fibro
   RMSE (train): 0.0831
🔧 Training model for: immune
   RMSE (train): 0.0840
🔧 Training model for: classic
   RMSE (train): 0.0858
🔧 Training model for: basal
   RMSE (train): 0.0742
✅ Model trained successfully
============================================================
📊 EVALUATION METRICS
============================================================
Global RMSE: 0.0741
Global MAE:  0.0563
Global R²:   0.7550

RMSE by cell type:
   endo:    0.0584
   fibro:   0.0658
   immune:  0.0735
   classic: 0.0879
   basal:   0.0811
============================================================
🔄 Cross-Validation (10 folds)
============================================================
🔄 Cross-validation (10 folds)...
Fold 1:  RMSE = 0.0559
Fold 2:  RMSE = 0.0824
Fold 3:  RMSE = 0.0873
Fold 4:  RMSE = 0.0987
Fold 5:  RMSE = 0.0496
Fold 6:  RMSE = 0.1091
Fold 7:  RMSE = 0.0843
Fold 8:  RMSE = 0.0984
Fold 9:  RMSE = 0.0814
Fold 10: RMSE = 0.0566
📊 Mean RMSE: 0.0804 (±0.0191)
============================================================
🔍 Top 20 most important features
============================================================
       feature          importance
34471  met_cg21623671   0.100180
34000  met_cg21126707   0.099820
16770  met_cg00973677   0.032689
16350  met_cg00509616   0.026528
16951  met_cg01200060   0.024618
16005  met_cg00112517   0.024551
15909  met_cg00003994   0.023815
16775  met_cg00980978   0.023091
16171  met_cg00297584   0.022412
15983  met_cg00079563   0.022296
38179  met_cg25933726   0.019122
20180  met_cg04988423   0.018648
35878  met_cg23280339   0.017841
23541  met_cg09017174   0.017251
16263  met_cg00412772   0.017170
21402  met_cg06489418   0.016815
38720  met_cg26557658   0.016721
24297  met_cg09923855   0.016700
20590  met_cg05498681   0.016237
30706  met_cg17253459   0.014604
💾 Saving model to: d:\Github\KokoWorks\DataCamp\Hadaka\Solution\models\xgb_deconvolution_model.pkl
============================================================
✅ Process completed
⏱️ Execution time: 6m 50.18s
============================================================
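
The cross-validation summary line can be rechecked from the ten per-fold values in the log. The following is a minimal sketch, not the original script: the `summarize` helper and the variable names are hypothetical, and only the Python standard library is used. It computes both the population and the sample standard deviation, since the log does not say which one the ± figure represents.

```python
from statistics import fmean, pstdev, stdev

# Per-fold RMSE values copied from the cross-validation section of the log.
fold_rmse = [0.0559, 0.0824, 0.0873, 0.0987, 0.0496,
             0.1091, 0.0843, 0.0984, 0.0814, 0.0566]


def summarize(values):
    """Return (mean, population std, sample std) for a list of floats.

    Hypothetical helper for illustration; not taken from the original code.
    """
    return fmean(values), pstdev(values), stdev(values)


mean_rmse, pop_std, sample_std = summarize(fold_rmse)
print(f"Mean RMSE: {mean_rmse:.4f}")
print(f"Population std (ddof=0): {pop_std:.4f}")
print(f"Sample std (ddof=1):     {sample_std:.4f}")
```

With these inputs the mean rounds to 0.0804 and the population standard deviation to 0.0191, which agrees with the reported summary; the sample standard deviation is slightly larger at about 0.0202. That the reported spread is a population standard deviation is an inference from the arithmetic, not something the log states.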