ECE implementation

Nicholas Schense, 3 months ago
Parent
Commit c8493d9564
6 files changed, with 41 insertions and 7 deletions
  1. .gitignore (+1 -1)
  2. config.toml (+4 -4)
  3. daily_log.md (+23 -1)
  4. planning.md (+2 -0)
  5. threshold.py (+10 -0)
  6. utils/metrics.py (+1 -1)

+ 1 - 1
.gitignore

@@ -1,6 +1,6 @@
 #Custom gitignore
 saved_models/
-nohup.out
+nohup.out 
 
 
 # Byte-compiled / optimized / DLL files

+ 4 - 4
config.toml

@@ -7,8 +7,8 @@ model_output = '/export/home/nschense/alzheimers/alzheimers_nn/saved_models/'
 
 [training]
 device = 'cuda:1'
-runs = 30
-max_epochs = 10
+runs = 100
+max_epochs = 30
 
 [dataset]
 validation_split = 0.4 #Splits the dataset into the train and validation/test set, 50% each for validation and test
@@ -16,7 +16,7 @@ validation_split = 0.4 #Splits the dataset into the train and validation/test se
 #|splt*0.5  | split*0.5      | 1-split   |
 
 [model]
-name = 'cnn-30x10'
+name = 'cnn-100x30-2'
 image_channels = 1
 clin_data_channels = 2
 
@@ -29,5 +29,5 @@ droprate = 0.5
 silent = false
 
 [ensemble]
-name = 'cnn-30x10'
+name = 'cnn-100x30-2'
 prune_threshold = 0.0 # Any models with accuracy below this threshold will be pruned, set to 0 to disable pruning
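
The `validation_split` comment in this config describes a three-way partition: a fraction `split` of the data is held out and divided evenly between validation and test, and the remaining `1 - split` is used for training. A minimal sketch of that arithmetic follows. The function name `split_sizes`, the rounding rule, and giving any odd leftover sample to the test set are assumptions for illustration; the repository's actual dataset loader is not shown in this diff.

```python
def split_sizes(n_samples, validation_split=0.4):
    """Return (n_train, n_val, n_test) for a held-out fraction split evenly in two.

    Hypothetical helper: the rounding and the handling of an odd held-out
    count are assumptions, not taken from the repository.
    """
    n_holdout = int(round(n_samples * validation_split))  # validation + test together
    n_val = n_holdout // 2                                # half of the held-out part
    n_test = n_holdout - n_val                            # the rest (gets any odd sample)
    n_train = n_samples - n_holdout                       # the remaining 1 - split
    return n_train, n_val, n_test


if __name__ == "__main__":
    print(split_sizes(100, 0.4))
```

With `validation_split = 0.4` as configured here, 60% of the samples would go to training and 20% each to validation and test.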

+ 23 - 1
daily_log.md

@@ -18,5 +18,27 @@ Relativly sedate day, mostly just rewriting the dataset system to ensure that lo
 - Investigate BNNs further
 
 ## Monday, June 17, 2024
-First day of Week 3!
+First day of Week 3! Pretty slow today as well, mainly working on ECE and figuring out why the coverage curves are so weird. Reading more about ensemble methods, going to try some calibration techniques. Overall, though, pretty good! Hope to be able to do some more after meeting with Ali and maybe Brayden, potentially look at some training methods (bagging, boosting etc.) and continue to investigate the weird coverage metrics.
+
+### Progress
+- Implemented ECE metric
+- Continued reading about uncertainty quantification
+- Looked into bagging/boosting implementations
+
+### Future
+- Continue to investigate low accuracy at high certainty
+- Continue reading
+- Meet with Ali Wednesday
+
+
+### Tuesday, June 18, 2024
+Slow day today, mostly continued with reading. Began training a new model with 100x30 runs/epochs, should be able to serve as base model for future work now that the data is deterministic
+
+### Progress
+- Continued to read about calibration and calibration errors
+- Investigated libraries for ensemble training
+
+### Future
+- Meet with Ali
+- Continue reading
   

+ 2 - 0
planning.md

@@ -1,12 +1,14 @@
 # Planning
 
 As of now, we have a program set up to be able to:
+
 - train an individual model with specific hyperparameters
 - train a ensemble of models with the identical hyperparameters 
 - evaluate the accuracy of an ensemble of models 
 - perform a coverage analysis on an ensemble of models 
 
 The goal of this rewrite is to preserve those functions while making the program significantly cleaner and easier to use, and to make it easier to extend with new functionality in the future as well. The hope is for this project to take approximately ~1-2 days, and be completed by Monday (6/17). The additional features that I would like to implement are:
+
 - Recording sets of models and ensembles and reporting their metrics (must be able to distinguish between individual models and ensembles)
 - Better configurating settings, able to use waterfalling files (files in subdirectories override settings in main directory)
 - Descriptive configuration (describe the desired models, ensembles and analysis in a config file and have that be produced on program run)

+ 10 - 0
threshold.py

@@ -8,6 +8,7 @@ import torch
 import matplotlib.pyplot as plt
 import sklearn.metrics as metrics
 from tqdm import tqdm
+import utils.metrics as met
 
 RUN = True
 
@@ -247,3 +248,12 @@ plt.ylabel('Number of incorrect predictions')
 plt.savefig(
     f"{config['paths']['model_output']}{config['ensemble']['name']}/incorrect_predictions.png"
 )
+
+ece = met.ECE(predictions['Prediction'], predictions['Actual'])
+
+print(f'ECE: {ece}')
+
+with open(
+    f"{config['paths']['model_output']}{config['ensemble']['name']}/summary.txt", 'a'
+) as f:
+    f.write(f'ECE: {ece}\n')

+ 1 - 1
utils/metrics.py

@@ -29,4 +29,4 @@ def ECE(samples, true_labels, M=5):
             avg_confid = confidences[in_bin].mean()
             ece += np.abs(avg_confid - accuracy_in_bin) * prob_in_bin
 
-        return ece
+    return ece[0]
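
The change above moves the `return` out of the bin loop, so every bin contributes to the sum before the value is returned. For reference, here is a minimal, self-contained sketch of an expected calibration error computation with equal-width confidence bins. It is not the repository's `ECE(samples, true_labels, M=5)`, whose full body is not visible in this diff: the function name `expected_calibration_error`, the separate `confidences` and `predicted_labels` arguments, and the bin edges are assumptions for illustration.

```python
import numpy as np


def expected_calibration_error(confidences, predicted_labels, true_labels, n_bins=5):
    """Weighted mean of |confidence - accuracy| over equal-width confidence bins.

    Hypothetical signature: the repository's ECE takes (samples, true_labels, M).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(predicted_labels) == np.asarray(true_labels)
    edges = np.linspace(0.0, 1.0, n_bins + 1)

    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        # Samples whose confidence falls in the half-open bin (lower, upper]
        in_bin = (confidences > lower) & (confidences <= upper)
        prob_in_bin = in_bin.mean()  # fraction of all samples in this bin
        if prob_in_bin > 0:
            accuracy_in_bin = correct[in_bin].mean()
            avg_confid = confidences[in_bin].mean()
            ece += abs(avg_confid - accuracy_in_bin) * prob_in_bin

    # Return only after every bin has been accumulated
    return float(ece)


if __name__ == "__main__":
    conf = [0.9, 0.9, 0.7, 0.7]
    pred = [1, 1, 0, 0]
    true = [1, 0, 0, 0]
    print(f"ECE: {expected_calibration_error(conf, pred, true)}")
```

Returning a plain `float` here avoids the need for indexing such as `ece[0]`, which is only required when the accumulator is a one-element array.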