GPU Kernel Information Aggregated by Layer
layer_index | layer_name | layer_type | layer_duration (us) | layer_gpu_duration (us) | layer_cpu_duration (us) | layer_flops | layer_dram_read_bytes | layer_dram_write_bytes | layer_achieved_occupancy (%) | layer_arithmetic_intensity (flops/byte) | layer_arithmetic_throughput (GFlops) | layer_memory_bound |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | resnetv20_conv0_fwd | Convolution | 18797.67 | 133.67 | 18664.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
2 | resnetv20_batchnorm1_fwd | BatchNorm | 166.67 | 49.67 | 117.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
3 | resnetv20_relu0_fwd | Activation | 56.00 | 45.33 | 10.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
4 | resnetv20_pool0_fwd | Pooling | 3994.00 | 52.00 | 3942.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
5 | resnetv20_stage1_batchnorm0_fwd | BatchNorm | 142.00 | 10.33 | 131.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
6 | resnetv20_stage1_activation0 | Activation | 21.00 | 6.00 | 15.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
7 | resnetv20_stage1_conv0_fwd | Convolution | 18032.00 | 74.67 | 17957.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
8 | resnetv20_stage1_batchnorm1_fwd | BatchNorm | 126.00 | 8.00 | 118.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
9 | resnetv20_stage1_activation1 | Activation | 20.67 | 6.00 | 14.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
10 | resnetv20_stage1_conv1_fwd | Convolution | 17993.00 | 58.67 | 17934.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
11 | resnetv20_stage1__plus0 | elemwise_add | 119.33 | 10.00 | 109.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
12 | resnetv20_stage1_batchnorm2_fwd | BatchNorm | 33.67 | 10.00 | 23.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
13 | resnetv20_stage1_activation2 | Activation | 23.00 | 6.00 | 17.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
14 | resnetv20_stage1_conv2_fwd | Convolution | 17997.33 | 59.33 | 17938.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
15 | resnetv20_stage1_batchnorm3_fwd | BatchNorm | 123.33 | 8.00 | 115.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
16 | resnetv20_stage1_activation3 | Activation | 22.33 | 6.00 | 16.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
17 | resnetv20_stage1_conv3_fwd | Convolution | 17999.67 | 58.67 | 17941.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
18 | resnetv20_stage1__plus1 | elemwise_add | 120.00 | 10.00 | 110.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
19 | resnetv20_stage2_batchnorm0_fwd | BatchNorm | 25.67 | 8.00 | 17.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
20 | resnetv20_stage2_activation0 | Activation | 22.33 | 6.00 | 16.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
21 | resnetv20_stage2_conv0_fwd | Convolution | 8637.33 | 144.00 | 8493.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
22 | resnetv20_stage2_batchnorm1_fwd | BatchNorm | 117.67 | 6.00 | 111.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
23 | resnetv20_stage2_activation1 | Activation | 16.00 | 4.00 | 12.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
24 | resnetv20_stage2_conv1_fwd | Convolution | 16159.00 | 80.67 | 16078.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
25 | resnetv20_stage2_conv2_fwd | Convolution | 974.00 | 44.33 | 929.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
26 | resnetv20_stage2__plus0 | elemwise_add | 114.67 | 5.00 | 109.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
27 | resnetv20_stage2_batchnorm2_fwd | BatchNorm | 24.33 | 6.00 | 18.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
28 | resnetv20_stage2_activation2 | Activation | 16.33 | 5.00 | 11.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
29 | resnetv20_stage2_conv3_fwd | Convolution | 16147.33 | 82.00 | 16065.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
30 | resnetv20_stage2_batchnorm3_fwd | BatchNorm | 120.00 | 6.00 | 114.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
31 | resnetv20_stage2_activation3 | Activation | 16.67 | 4.67 | 12.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
32 | resnetv20_stage2_conv4_fwd | Convolution | 16143.67 | 70.00 | 16073.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
33 | resnetv20_stage2__plus1 | elemwise_add | 114.67 | 6.00 | 108.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
34 | resnetv20_stage3_batchnorm0_fwd | BatchNorm | 25.00 | 6.00 | 19.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
35 | resnetv20_stage3_activation0 | Activation | 15.00 | 4.33 | 10.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
36 | resnetv20_stage3_conv0_fwd | Convolution | 8252.00 | 223.33 | 8028.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
37 | resnetv20_stage3_batchnorm1_fwd | BatchNorm | 119.33 | 6.00 | 113.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
38 | resnetv20_stage3_activation1 | Activation | 12.67 | 4.00 | 8.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
39 | resnetv20_stage3_conv1_fwd | Convolution | 15538.67 | 133.67 | 15405.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
40 | resnetv20_stage3_conv2_fwd | Convolution | 845.00 | 56.00 | 789.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
41 | resnetv20_stage3__plus0 | elemwise_add | 114.00 | 4.00 | 110.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
42 | resnetv20_stage3_batchnorm2_fwd | BatchNorm | 21.67 | 6.00 | 15.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
43 | resnetv20_stage3_activation2 | Activation | 13.67 | 5.00 | 8.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
44 | resnetv20_stage3_conv3_fwd | Convolution | 15450.33 | 136.00 | 15314.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
45 | resnetv20_stage3_batchnorm3_fwd | BatchNorm | 119.33 | 5.67 | 113.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
46 | resnetv20_stage3_activation3 | Activation | 14.00 | 4.00 | 10.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
47 | resnetv20_stage3_conv4_fwd | Convolution | 15489.00 | 119.67 | 15369.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
48 | resnetv20_stage3__plus1 | elemwise_add | 113.33 | 4.00 | 109.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
49 | resnetv20_stage4_batchnorm0_fwd | BatchNorm | 22.00 | 6.00 | 16.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
50 | resnetv20_stage4_activation0 | Activation | 13.00 | 4.00 | 9.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
51 | resnetv20_stage4_conv0_fwd | Convolution | 9796.67 | 355.67 | 9441.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
52 | resnetv20_stage4_batchnorm1_fwd | BatchNorm | 121.00 | 5.67 | 115.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
53 | resnetv20_stage4_activation1 | Activation | 11.33 | 4.00 | 7.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
54 | resnetv20_stage4_conv1_fwd | Convolution | 18875.33 | 359.00 | 18516.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
55 | resnetv20_stage4_conv2_fwd | Convolution | 1013.33 | 82.00 | 931.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
56 | resnetv20_stage4__plus0 | elemwise_add | 115.67 | 3.00 | 112.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
57 | resnetv20_stage4_batchnorm2_fwd | BatchNorm | 21.00 | 5.00 | 16.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
58 | resnetv20_stage4_activation2 | Activation | 12.00 | 4.00 | 8.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
59 | resnetv20_stage4_conv3_fwd | Convolution | 18874.33 | 358.00 | 18516.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
60 | resnetv20_stage4_batchnorm3_fwd | BatchNorm | 120.67 | 5.00 | 115.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
61 | resnetv20_stage4_activation3 | Activation | 12.33 | 4.00 | 8.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
62 | resnetv20_stage4_conv4_fwd | Convolution | 18894.00 | 341.00 | 18553.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
63 | resnetv20_stage4__plus1 | elemwise_add | 119.00 | 3.67 | 115.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
64 | resnetv20_batchnorm2_fwd | BatchNorm | 21.67 | 5.00 | 16.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
65 | resnetv20_relu1_fwd | Activation | 12.00 | 4.00 | 8.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
66 | resnetv20_pool1_fwd | Pooling | 77.67 | 18.00 | 59.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
67 | resnetv20_flatten0_flatten0 | Flatten | 2.00 | 0.00 | 2.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
68 | resnetv20_dense0_fwd | FullyConnected | 828.67 | 25.00 | 803.67 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
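
In the rows above, layer_duration appears to equal layer_gpu_duration plus layer_cpu_duration (up to rounding), and the CPU portion dominates for every layer. The following is a minimal sketch, not part of the original tooling, of how the table could be parsed and summarized per layer_type to check this. The function names `parse_rows` and `summarize_by_type`, the dictionary keys, and the three-row `SAMPLE` are assumptions introduced for illustration; the sample rows are copied from the table.

```python
from collections import defaultdict

# Three rows copied verbatim from the table (pipe-delimited, trailing pipe).
SAMPLE = """\
1 | resnetv20_conv0_fwd | Convolution | 18797.67 | 133.67 | 18664.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
2 | resnetv20_batchnorm1_fwd | BatchNorm | 166.67 | 49.67 | 117.00 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
7 | resnetv20_stage1_conv0_fwd | Convolution | 18032.00 | 74.67 | 17957.33 | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | true |
"""


def parse_rows(text):
    """Parse data rows; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
        # A data row has 13 cells and starts with a numeric layer index.
        if len(cells) != 13 or not cells[0].isdigit():
            continue
        rows.append({
            "index": int(cells[0]),
            "name": cells[1],
            "type": cells[2],
            "duration_us": float(cells[3]),
            "gpu_us": float(cells[4]),
            "cpu_us": float(cells[5]),
        })
    return rows


def summarize_by_type(rows):
    """Return {layer_type: (total_us, gpu_us, gpu_fraction)}."""
    total = defaultdict(float)
    gpu = defaultdict(float)
    for r in rows:
        total[r["type"]] += r["duration_us"]
        gpu[r["type"]] += r["gpu_us"]
    return {
        t: (total[t], gpu[t], gpu[t] / total[t] if total[t] else 0.0)
        for t in total
    }


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for r in rows:
        # Assumed relation: total = GPU + CPU, allowing for rounding.
        assert abs(r["duration_us"] - (r["gpu_us"] + r["cpu_us"])) < 0.02
    for layer_type, (tot, g, frac) in summarize_by_type(rows).items():
        print(f"{layer_type}: total={tot:.2f} us, gpu={g:.2f} us, gpu_fraction={frac:.4f}")
```

Passing the full table text to `parse_rows` instead of `SAMPLE` would give per-type totals for all 68 layers. Since layer_flops and the DRAM byte counts are all zero in this table, the derived intensity and throughput columns carry no information here and are left out of the sketch.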