Monday, May 4, 2015

A simple progress 'bar' for IPython Notebooks

Doing data science, I often start loops without a clear idea of how long they'll take. With very large datasets, it can be hours. That's why I wrote this quick-and-dirty progress bar (okay, there's no bar; it's a counter to 100), so I can judge whether I should wait, get up to make some coffee, go do laundry, or spin up a large AWS instance to get the job done.

The GitHub repo is here. If you're wondering how I made an IPython notebook look somewhat pretty on a blog, I blogged about it here.

 
# code for the progress bar

import time

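class ProgressBar:
    """Print whole-number percentages as a loop of known length runs."""

    def __init__(self, loop_length):
        self.start = time.time()  # wall-clock start, used for elapsed-time reports
        self.increment_size = 100.0/loop_length  # percent added per iteration
        self.curr_count = 0  # running percentage (float)
        self.curr_pct = 0  # last whole percent printed
        self.overflow = False  # set once the count passes 100%
        print('% complete: ', end='')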
    
    def increment(self):
        self.curr_count += self.increment_size
        if int(self.curr_count) > self.curr_pct:
            self.curr_pct = int(self.curr_count)
            if self.curr_pct <= 100:
                print(self.curr_pct, end=' ')
            elif not self.overflow:
                print("\n* Count has gone over 100%; likely either due to:\n  - an error in the loop_length specified when "
                      "progress_bar was instantiated\n  - an error in the placement of the increment() function")
                print('Elapsed time when progress bar full: {:0.1f} seconds.'.format(time.time() - self.start))
                self.overflow = True

    def finish(self):
        if 99 <= self.curr_pct <= 100:
            if self.curr_pct == 99:  # rounding sometimes makes the maximum count 99.
                print("100", end=' ')
            print('\nElapsed time: {:0.1f} seconds.\n'.format(time.time() - self.start))
        elif self.overflow:
            print('Elapsed time after end of loop: {:0.1f} seconds.\n'.format(time.time() - self.start))
        else:
            print('\n* End of loop reached earlier than expected.\nElapsed time: {:0.1f} seconds.\n'.format(time.time() - self.start))

# normal usage, on my slow crappy laptop

loop_length = 1000000

pbar = ProgressBar(loop_length)
for i in range(loop_length):
    # your code goes here
    pbar.increment()
pbar.finish()
% complete: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
Elapsed time: 1.7 seconds.
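
The comment in finish() mentions that rounding sometimes leaves the maximum count at 99. That comes from adding a float increment over and over. One possible way around it, sketched here as an assumption and not what the class above does, is to keep an integer iteration count and derive the percentage with integer arithmetic. The names percent_done and percents_reported are hypothetical, and the sketch assumes loop_length is an int.

```python
def percent_done(count, total):
    """Whole-number percent complete, using integer arithmetic only."""
    return count * 100 // total


def percents_reported(total):
    """Return the percentages a counter would print over `total` iterations."""
    shown = []
    last = 0
    for count in range(1, total + 1):
        pct = percent_done(count, total)
        if pct > last:  # report each new whole percent once
            shown.append(pct)
            last = pct
    return shown


# The final iteration lands on exactly 100, with no float drift to correct.
print(percents_reported(4))
```

With this approach the last iteration always maps to 100, so no special case for 99 would be needed.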

 
# here's what happens if the loop lengths are mismatched so that
# the progress bar expects fewer iterations than there are

loop_length = 1000000

pbar = ProgressBar(loop_length/2)
for i in range(loop_length):
    # your code goes here
    pbar.increment()
pbar.finish()
% complete: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
* Count has gone over 100%; likely either due to:
  - an error in the loop_length specified when progress_bar was instantiated
  - an error in the placement of the increment() function
Elapsed time when progress bar full: 0.8 seconds.
Elapsed time after end of loop: 1.5 seconds.

 
# here's what happens if the loop lengths are mismatched so that
# the progress bar expects more iterations than there are

loop_length = 1000000

pbar = ProgressBar(loop_length*2)
for i in range(loop_length):
    # your code goes here
    pbar.increment()
pbar.finish()
% complete: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 
* End of loop reached earlier than expected.
Elapsed time: 1.6 seconds.
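
Both mismatches above come from passing the loop length separately from the loop itself. A rough sketch of one way to avoid that, assuming the thing being looped over supports len(): wrap it in a generator that takes its length from the iterable. The name track is hypothetical and not part of the repo, and the sketch does its own counting so it can run on its own.

```python
import time


def track(iterable):
    """Yield each item of a sized iterable while printing percent complete."""
    total = len(iterable)  # assumption: the iterable supports len()
    start = time.time()
    last_pct = 0
    print('% complete: ', end='')
    for count, item in enumerate(iterable, start=1):
        yield item
        pct = count * 100 // total  # integer arithmetic, no float drift
        if pct > last_pct:
            print(pct, end=' ')
            last_pct = pct
    print('\nElapsed time: {:0.1f} seconds.\n'.format(time.time() - start))


for i in track(range(1000000)):
    pass  # your code goes here
```

The trade-off is that this only works for iterables with a known length; a plain generator would still need its length passed in by hand.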
