GPU Kernel Information Aggregated by Layer
layer_index | layer_name | layer_type | layer_duration (us) | layer_gpu_duration (us) | layer_cpu_duration (us) | layer_flops | layer_dram_read_bytes | layer_dram_write_bytes | layer_achieved_occupancy (%) | layer_arithmetic_intensity (flops/byte) | layer_arithmetic_throughput (GFlops) | layer_memory_bound |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | vgg4_relu0_fwd | Activation | 585.00 | 49.00 | 536.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
2 | vgg4_conv1_fwd | Convolution | 241222.67 | 301.33 | 240921.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
3 | vgg4_relu1_fwd | Activation | 588.33 | 49.00 | 539.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
4 | vgg4_pool0_fwd | Pooling | 6438.67 | 33.33 | 6405.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
5 | vgg4_conv2_fwd | Convolution | 100122.00 | 159.67 | 99962.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
6 | vgg4_relu2_fwd | Activation | 301.67 | 19.67 | 282.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
7 | vgg4_conv3_fwd | Convolution | 199591.33 | 260.00 | 199331.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
8 | vgg4_relu3_fwd | Activation | 291.67 | 20.00 | 271.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
9 | vgg4_pool1_fwd | Pooling | 3542.67 | 12.67 | 3530.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
10 | vgg4_conv4_fwd | Convolution | 91360.00 | 170.67 | 91189.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
11 | vgg4_relu4_fwd | Activation | 148.33 | 6.00 | 142.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
12 | vgg4_conv5_fwd | Convolution | 186162.33 | 307.33 | 185855.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
13 | vgg4_relu5_fwd | Activation | 152.33 | 6.00 | 146.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
14 | vgg4_conv6_fwd | Convolution | 183749.00 | 306.00 | 183443.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
15 | vgg4_relu6_fwd | Activation | 152.33 | 6.00 | 146.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
16 | vgg4_pool2_fwd | Pooling | 1802.67 | 6.33 | 1796.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
17 | vgg4_conv7_fwd | Convolution | 87487.00 | 177.67 | 87309.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
18 | vgg4_relu7_fwd | Activation | 79.33 | 4.00 | 75.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
19 | vgg4_conv8_fwd | Convolution | 177488.33 | 332.33 | 177156.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
20 | vgg4_relu8_fwd | Activation | 81.33 | 4.33 | 77.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
21 | vgg4_conv9_fwd | Convolution | 178904.67 | 332.00 | 178572.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
22 | vgg4_relu9_fwd | Activation | 80.00 | 4.00 | 76.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
23 | vgg4_pool3_fwd | Pooling | 890.33 | 4.00 | 886.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
24 | vgg4_conv10_fwd | Convolution | 46756.67 | 178.33 | 46578.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
25 | vgg4_relu10_fwd | Activation | 25.00 | 2.33 | 22.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
26 | vgg4_conv11_fwd | Convolution | 46984.67 | 175.00 | 46809.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
27 | vgg4_relu11_fwd | Activation | 27.00 | 3.00 | 24.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
28 | vgg4_conv12_fwd | Convolution | 46893.33 | 175.00 | 46718.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
29 | vgg4_relu12_fwd | Activation | 23.00 | 2.67 | 20.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
30 | vgg4_pool4_fwd | Pooling | 286.67 | 4.00 | 282.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
31 | vgg4_dense0_fwd | FullyConnected | 147936.67 | 705.67 | 147231.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
32 | vgg4_dense0_relu_fwd | Activation | 14.67 | 2.00 | 12.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
33 | vgg4_dropout0_fwd | Dropout | 7.67 | 2.00 | 5.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
34 | vgg4_dense1_fwd | FullyConnected | 25785.33 | 120.67 | 25664.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
35 | vgg4_dense1_relu_fwd | Activation | 11.00 | 2.67 | 8.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
36 | vgg4_dropout1_fwd | Dropout | 7.00 | 2.00 | 5.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
37 | vgg4_dense2_fwd | FullyConnected | 6593.33 | 34.67 | 6558.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
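
In the rows above, layer_duration appears to equal layer_gpu_duration plus layer_cpu_duration, up to rounding in the second decimal. The following minimal sketch checks that on a few rows copied from the table and computes the GPU share of total time per layer type. The additive relationship is an observation from the data, not a documented property of the profiler, and the helper names `parse_rows` and `gpu_share_by_type` are hypothetical.

```python
from collections import defaultdict

# A few rows copied from the table above:
# layer_index | layer_name | layer_type | layer_duration | gpu | cpu (all in us).
ROWS = """\
1 | vgg4_relu0_fwd | Activation | 585.00 | 49.00 | 536.00
2 | vgg4_conv1_fwd | Convolution | 241222.67 | 301.33 | 240921.33
4 | vgg4_pool0_fwd | Pooling | 6438.67 | 33.33 | 6405.33
31 | vgg4_dense0_fwd | FullyConnected | 147936.67 | 705.67 | 147231.00
"""


def parse_rows(text):
    """Parse pipe-delimited rows into a list of dicts (hypothetical helper)."""
    layers = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        layers.append({
            "index": int(cells[0]),
            "name": cells[1],
            "type": cells[2],
            "duration_us": float(cells[3]),
            "gpu_us": float(cells[4]),
            "cpu_us": float(cells[5]),
        })
    return layers


def gpu_share_by_type(layers):
    """Return GPU time divided by total time for each layer type."""
    totals = defaultdict(lambda: [0.0, 0.0])  # type -> [gpu_us, duration_us]
    for layer in layers:
        totals[layer["type"]][0] += layer["gpu_us"]
        totals[layer["type"]][1] += layer["duration_us"]
    return {t: gpu / total for t, (gpu, total) in totals.items()}


layers = parse_rows(ROWS)

# Assumption under test: duration = gpu + cpu. The table values are rounded
# to two decimals, so allow a small tolerance.
for layer in layers:
    assert abs(layer["gpu_us"] + layer["cpu_us"] - layer["duration_us"]) < 0.02

shares = gpu_share_by_type(layers)
for layer_type, share in shares.items():
    print(f"{layer_type}: GPU share of layer time = {share:.2%}")
```

Extending `ROWS` to all 37 entries gives the per-type breakdown for the whole network. The FLOP, DRAM, and occupancy columns are omitted here because every value in them is zero in this report.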