First of all:
There is no way to determine a good network topology just from the
number of inputs and outputs. It depends critically on the number of
training examples and the complexity of the classification you are
trying to learn.[1]
Furthermore, Yoshua Bengio has proposed a very simple rule:
Just keep adding layers until the test error does not improve anymore.[2]
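To make that rule concrete, here is a minimal sketch of such a loop, assuming a tf.keras workflow. The synthetic dataset, the layer width of 64, and the training settings are illustrative assumptions, not values from the paper; in practice you would also hold out a separate test set and tune each depth more carefully:

```python
# Sketch of Bengio's rule: grow the network one hidden layer at a time
# and stop once held-out error no longer improves.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_hidden, n_features, n_classes):
    model = keras.Sequential([keras.Input(shape=(n_features,))])
    for _ in range(n_hidden):
        model.add(layers.Dense(64, activation="relu"))  # illustrative width
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Synthetic data purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")
y = (X[:, 0] > 0).astype("int64")
X_train, X_val = X[:800], X[800:]
y_train, y_val = y[:800], y[800:]

best_err, best_depth = np.inf, 0
for depth in range(1, 6):
    model = build_model(depth, n_features=20, n_classes=2)
    model.fit(X_train, y_train, epochs=10, verbose=0)
    err = 1.0 - model.evaluate(X_val, y_val, verbose=0)[1]
    if err < best_err:
        best_err, best_depth = err, depth
    else:
        break  # error stopped improving; keep the previous depth
print(f"chosen depth: {best_depth} hidden layer(s)")
```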
Moreover:
The earlier features of a ConvNet contain more generic features (e.g.
edge detectors or color blob detectors) that should be useful to many
tasks, but later layers of the ConvNet become progressively more
specific to the details of the classes contained in the original
dataset.[3]
For example, in a method for learning feature detectors:
the first layer learns edge detectors, subsequent layers learn more complex features, and higher-level layers encode more abstract features.[4]
So, using two dense layers is more advisable than using a single one.
Finally:
The original paper on Dropout provides a number of useful heuristics to consider when using dropout in practice. One of them is:
Use dropout on incoming (visible) as well as hidden units. Application of dropout at each layer of the network has shown good results. [5]
In a CNN, a Dropout layer is usually applied after each pooling layer, and also after the Dense layer, as in the sketch below. A good tutorial is available at [6].
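Here is a minimal tf.keras sketch of that placement, which also uses two dense layers in the head as suggested above. The input shape, filter counts, dense width, and dropout rates (0.2, 0.25, 0.5) are common default choices for illustration, not values taken from the references:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),          # e.g. MNIST-sized grayscale input
    layers.Dropout(0.2),                     # dropout on the visible (input) units, per [5]
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),                    # dropout after the first pooling layer
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),                    # dropout after the second pooling layer
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # first dense layer
    layers.Dropout(0.5),                     # dropout after the dense layer
    layers.Dense(10, activation="softmax"),  # second dense layer (output)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```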
References:
[1] https://www.cs.cmu.edu/Groups/AI/util/html/faqs/ai/neural/faq.html
[2] Bengio, Yoshua. "Practical recommendations for gradient-based training of deep architectures." Neural networks: Tricks of the trade. Springer Berlin Heidelberg, 2012. 437-478.
[3] http://cs231n.github.io/transfer-learning/
[4] http://learning.eng.cam.ac.uk/pub/Public/Turner/Teaching/ml-lecture-3-slides.pdf
[5] https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/
[6] https://cambridgespark.com/content/tutorials/convolutional-neural-networks-with-keras/index.html