
ECE implementation

Nicholas Schense 3 months ago
commit c8493d9564
6 changed files with 41 additions and 7 deletions
  1. .gitignore (+1 -1)
  2. config.toml (+4 -4)
  3. daily_log.md (+23 -1)
  4. planning.md (+2 -0)
  5. threshold.py (+10 -0)
  6. utils/metrics.py (+1 -1)

+ 1 - 1
.gitignore

@@ -1,6 +1,6 @@
 #Custom gitignore
 saved_models/
-nohup.out
+nohup.out 
 
 
 # Byte-compiled / optimized / DLL files

+ 4 - 4
config.toml

@@ -7,8 +7,8 @@ model_output = '/export/home/nschense/alzheimers/alzheimers_nn/saved_models/'
 
 [training]
 device = 'cuda:1'
-runs = 30
-max_epochs = 10
+runs = 100
+max_epochs = 30
 
 [dataset]
 validation_split = 0.4 #Splits the dataset into the train and validation/test set, 50% each for validation and test
@@ -16,7 +16,7 @@ validation_split = 0.4 #Splits the dataset into the train and validation/test se
 #|split*0.5 | split*0.5      | 1-split   |
 
 [model]
-name = 'cnn-30x10'
+name = 'cnn-100x30-2'
 image_channels = 1
 clin_data_channels = 2
 
@@ -29,5 +29,5 @@ droprate = 0.5
 silent = false
 
 [ensemble]
-name = 'cnn-30x10'
+name = 'cnn-100x30-2'
 prune_threshold = 0.0 # Any models with accuracy below this threshold will be pruned; set to 0 to disable pruning
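
The prune_threshold comment describes the pruning rule in prose; as a minimal sketch (the ensemble_models list and per-model accuracy attribute are hypothetical names, not this repository's code), it amounts to:

# Hypothetical names; only the rule itself comes from the config comment.
prune_threshold = config['ensemble']['prune_threshold']
kept = [m for m in ensemble_models if m.accuracy >= prune_threshold]
# With prune_threshold = 0.0 every model passes, i.e. pruning is disabled.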

+ 23 - 1
daily_log.md

@@ -18,5 +18,27 @@ Relativly sedate day, mostly just rewriting the dataset system to ensure that lo
 - Investigate BNNs further
 
 ## Monday, June 17, 2024
-First day of Week 3!
+First day of Week 3! Pretty slow today as well, mainly working on ECE and figuring out why the coverage curves are so weird. Reading more about ensemble methods, and going to try some calibration techniques. Overall, though, pretty good! Hope to do some more after meeting with Ali and maybe Brayden, potentially looking at some training methods (bagging, boosting, etc.) and continuing to investigate the weird coverage metrics.
+
+### Progress
+- Implemented ECE metric
+- Continued reading about uncertainty quantification
+- Looked into bagging/boosting implementations
+
+### Future
+- Continue to investigate low accuracy at high certainty
+- Continue reading
+- Meet with Ali Wednesday
+
+
+## Tuesday, June 18, 2024
+Slow day today, mostly continued with reading. Began training a new model with 100x30 runs/epochs, which should serve as a base model for future work now that the data is deterministic.
+
+### Progress
+- Continued to read about calibration and calibration errors
+- Investigated libraries for ensemble training
+
+### Future
+- Meet with Ali
+- Continue reading

+ 2 - 0
planning.md

@@ -1,12 +1,14 @@
 # Planning
 
 As of now, we have a program set up to be able to:
+
 - train an individual model with specific hyperparameters
 - train an ensemble of models with identical hyperparameters
 - evaluate the accuracy of an ensemble of models
 - perform a coverage analysis on an ensemble of models
 
 The goal of this rewrite is to preserve those functions while making the program significantly cleaner and easier to use, and to make it easier to extend with new functionality in the future. The hope is for this project to take approximately 1-2 days and be completed by Monday (6/17). The additional features that I would like to implement are:
+
 - Recording sets of models and ensembles and reporting their metrics (must be able to distinguish between individual models and ensembles)
 - Better configuration settings, able to use waterfalling files (files in subdirectories override settings in the main directory); see the sketch after this diff
 - Descriptive configuration (describe the desired models, ensembles, and analysis in a config file and have that be produced on program run)
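
As referenced in the list above, here is a minimal sketch of how waterfalling config files could work, assuming one TOML file per directory level and a shallow last-wins merge. All names are assumptions, not the planned implementation, and tomllib requires Python 3.11+:

import tomllib
from pathlib import Path

def load_waterfall(leaf_dir, root_dir, name='config.toml'):
    # Walk from the main directory down to the leaf so that files in
    # subdirectories override settings from the main directory.
    leaf, root = Path(leaf_dir), Path(root_dir)
    parts = leaf.relative_to(root).parts
    dirs = [root] + [root.joinpath(*parts[:i + 1]) for i in range(len(parts))]
    merged = {}
    for d in dirs:
        cfg = d / name
        if cfg.exists():
            with open(cfg, 'rb') as f:
                merged.update(tomllib.load(f))  # shallow merge: later files win
    return merged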

+ 10 - 0
threshold.py

@@ -8,6 +8,7 @@ import torch
 import matplotlib.pyplot as plt
 import sklearn.metrics as metrics
 from tqdm import tqdm
+import utils.metrics as met
 
 RUN = True
 
@@ -247,3 +248,12 @@ plt.ylabel('Number of incorrect predictions')
 plt.savefig(
     f"{config['paths']['model_output']}{config['ensemble']['name']}/incorrect_predictions.png"
 )
+
+ece = met.ECE(predictions['Prediction'], predictions['Actual'])
+
+print(f'ECE: {ece}')
+
+with open(
+    f"{config['paths']['model_output']}{config['ensemble']['name']}/summary.txt", 'a'
+) as f:
+    f.write(f'ECE: {ece}\n')

+ 1 - 1
utils/metrics.py

@@ -29,4 +29,4 @@ def ECE(samples, true_labels, M=5):
             avg_confid = confidences[in_bin].mean()
             ece += np.abs(avg_confid - accuracy_in_bin) * prob_in_bin
 
-        return ece
+    return ece[0]
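
For context on this fix: the old return ece sat inside the bin loop (so the function returned after the first bin), while the new return ece[0] is at function level and unwraps the one-element array. A sketch of the full routine consistent with the visible fragment, with everything outside the hunk reconstructed under assumptions (equal-width bins over top-class confidence, numpy throughout), not the repository's exact code:

import numpy as np

def ECE(samples, true_labels, M=5):
    # samples: (N, C) array of predicted class probabilities
    # true_labels: (N,) array of integer class labels
    # M: number of equal-width confidence bins
    bin_boundaries = np.linspace(0, 1, M + 1)
    confidences = np.max(samples, axis=1)              # top-class probability
    accuracies = np.argmax(samples, axis=1) == true_labels
    ece = np.zeros(1)                                  # array, hence ece[0] below
    for lower, upper in zip(bin_boundaries[:-1], bin_boundaries[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        prob_in_bin = in_bin.mean()                    # fraction of samples in this bin
        if prob_in_bin > 0:
            accuracy_in_bin = accuracies[in_bin].mean()
            avg_confid = confidences[in_bin].mean()
            ece += np.abs(avg_confid - accuracy_in_bin) * prob_in_bin
    return ece[0]                                      # scalar, matching the fixed line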