Fredrik, this is great feedback. We’ve gone back and forth on this many times over the years.
It has indeed been a goal to minimize the impact of Apostrophe on the DOM. However, we run into some realities. If a widget has no content, or the content isn’t very tall, we can’t fit the buttons, so we have to add height to that widget. If a widget is a rich text widget, we have to create some space to accommodate both reading the text (or clicking into it to edit it) and accessing the buttons for the widget. And so on. Drag and drop operations, as well as widget players, need at a minimum to be able to find a container element for the widget as a whole.
There is also the challenge of repositioning the controls if the widget gets moved. Allowing them to actually be in the DOM, inside the widget, is certainly simpler and feels less “jumpy” than having JavaScript try to keep the controls and the widget together as the widget is dragged.
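To make that tradeoff concrete, here is a rough sketch of what the out-of-widget approach would have to compute. This is not Apostrophe code: `positionControls`, its parameters and the 4px gap are all made up for illustration, and it assumes the controls are a separate fixed-position element placed from the widget's viewport rectangle.

```javascript
// Hypothetical helper, not part of Apostrophe. Given the widget's
// viewport rectangle and the height of the controls bar, work out where
// a detached, fixed-position controls bar would have to sit.
function positionControls(widgetRect, controlsSize) {
  const gap = 4; // assumed spacing between the controls and the widget
  const above = widgetRect.top - controlsSize.height - gap;
  // If there is no room above the widget, fall back to overlapping its
  // top edge, which is the same cramped situation described earlier.
  const overlapsWidget = above < 0;
  return {
    top: overlapsWidget ? widgetRect.top : above,
    left: widgetRect.left,
    overlapsWidget
  };
}
```

In a browser the rectangle would come from something like `widgetEl.getBoundingClientRect()`, and the calculation would have to be repeated on every pointer move during a drag, as well as on scroll and resize. That constant re-measuring is where the “jumpy” feeling comes from.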
But having the controls nested in the widget has all sorts of implications, as you’ve pointed out. This is worth another rethink for 3.x.
I will ask others to chime in on this thread as well.