How to summarize very large neural networks?

Question

Oskar

May 29, 2022, 13:48

I am doing a lot of work with transfer learning at the moment (using Keras and TensorFlow, if that is relevant), and I am having trouble summarizing very large models adequately.

The post "How do you visualize neural network architectures?" shows a lot of useful methods for visualizing architectures. They work well for networks such as VGG16, but none of them are reasonable to include in a report if the models are very large (such as InceptionResNetV2-based networks).

My current approach is to simply include the depth, the number of parameters, and data such as model size and accuracy on the ImageNet validation set. I would like to include more fine-grained information, however.

What I have tried: several of the methods from the post above, and exporting the Keras model summary to a table. The issue with the table is that for the largest networks it has over 700 rows, which in my opinion clutters the report even if it is only included in the appendix.

So what I would like to know: do you have any recommendations for how to summarize very large network architectures in a way that is actually informative about the inner workings without taking up 7 pages? It does not need to be in any particular format, but I would prefer a table or figure if one exists, and ideally it would take up no more than one or two pages for the largest models.

I feel like I have searched quite thoroughly for a solution, but I cannot seem to find one.

Topic transfer-learning machine-learning-model deep-learning visualization machine-learning

Category Data Science

Nikos M. · Accepted Answer · May 29, 2022, 13:48

One way to summarize a complex architecture is to use "abstract boxes" for known, well-defined parts and computations instead of drawing their complete detailed architecture.

Thus a large, complex model can be represented as a set of simpler abstract boxes, each standing for a group of computations or layers. In this way the representation stays concise without sacrificing clarity.

For example, see the following summarization of VGG16: [figure not included]

In an even larger net that includes VGG16 as a subnetwork, one can draw a single "box" for the whole VGG16 network, since it can be considered known and well-defined.
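The same idea can be applied to a table instead of a figure: collapse the per-layer rows into one row per abstract box. Below is a minimal sketch of that, not a definitive implementation. It assumes the layers are available as (name, type, parameter count) tuples (in Keras these could be gathered from `model.layers` using `layer.name`, `type(layer).__name__`, and `layer.count_params()`), and that the part of the layer name before the first underscore identifies the block, as in names like `block1_conv1`. The helper names `summarize_blocks` and `prefix_block` and the sample rows are hypothetical.

```python
from collections import OrderedDict


def summarize_blocks(layers, block_of):
    """Collapse per-layer rows into one row per abstract block.

    layers:   iterable of (name, layer_type, n_params) tuples
    block_of: function mapping a layer name to a block label
    """
    blocks = OrderedDict()
    for name, layer_type, n_params in layers:
        label = block_of(name)
        entry = blocks.setdefault(label, {"layers": 0, "params": 0, "types": []})
        entry["layers"] += 1
        entry["params"] += n_params
        # Record each layer type once, in order of first appearance.
        if layer_type not in entry["types"]:
            entry["types"].append(layer_type)
    return blocks


def prefix_block(name):
    # Assumption: the text before the first "_" names the block.
    return name.split("_", 1)[0]


# Illustrative rows only; a real model would supply these from its layer list.
layers = [
    ("block1_conv1", "Conv2D", 1792),
    ("block1_conv2", "Conv2D", 36928),
    ("block1_pool", "MaxPooling2D", 0),
    ("block2_conv1", "Conv2D", 73856),
    ("block2_conv2", "Conv2D", 147584),
    ("block2_pool", "MaxPooling2D", 0),
    ("flatten", "Flatten", 0),
]

summary = summarize_blocks(layers, prefix_block)
for label, row in summary.items():
    print(f"{label:10s} {row['layers']:3d} layers {row['params']:>10,d} params  "
          f"{', '.join(row['types'])}")
```

To treat a whole known subnetwork such as VGG16 as one box, `block_of` can be swapped for a function that returns the same label for every layer belonging to that subnetwork.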

Any diagramming software can be used to draw the network summary (e.g., Dia).
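If drawing by hand is too tedious, the boxes can also be generated as text and rendered by a layout tool. The sketch below swaps in Graphviz rather than Dia: it emits a DOT description of a simple chain of boxes, which the Graphviz `dot` command can render (for example with `dot -Tpdf`). The function name `blocks_to_dot`, the block labels, and the classifier head are hypothetical, and a strictly sequential chain is assumed, so branching architectures would need extra edges.

```python
def blocks_to_dot(blocks, graph_name="summary"):
    """Return Graphviz DOT text for a left-to-right chain of labelled boxes.

    blocks: list of (label, n_params) tuples, in network order
    """
    lines = [
        f"digraph {graph_name} {{",
        "  rankdir=LR;",
        "  node [shape=box];",
    ]
    # One node per abstract box; "\\n" becomes a literal \n, DOT's line break.
    for i, (label, n_params) in enumerate(blocks):
        lines.append(f'  n{i} [label="{label}\\n{n_params:,} params"];')
    # Connect consecutive boxes (assumes a purely sequential summary).
    for i in range(len(blocks) - 1):
        lines.append(f"  n{i} -> n{i + 1};")
    lines.append("}")
    return "\n".join(lines)


# Illustrative boxes: a known backbone drawn as one box plus a hypothetical head.
blocks = [
    ("input", 0),
    ("VGG16 (without top)", 14714688),
    ("classifier head (hypothetical)", 5130),
]

dot = blocks_to_dot(blocks)
print(dot)
```

Saving the printed text to a file and passing it to Graphviz produces a compact figure with one box per block, regardless of how many layers each box hides.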
