And by this I mean that the engineers who implemented forked files in the original HFS created two forks with fundamentally different behaviors.
The data fork is what you would think of as a traditional data file - it's a byte stream, and all the routines that manipulate it treat it like a stream.
The resource fork is more like a database - it maintains an index and provides a means of pulling "named" items from it. It was primarily used for persistent storage of Window Records, Control Records, all of the literal text from the program (for localization), colors, layouts, etc. There were also code resources - hunks of binary that implemented components of sorts. In fact, old-style 68k Mac applications had all their binary in the resource fork as various code resources. HyperCard would let you extend the HyperTalk language by building a code resource and adding it to a "stack" file. The name of the resource was the name of the function in the HyperTalk language.
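To make the "database" analogy concrete, here is a toy in-memory model of that lookup scheme (a sketch only - the class and method names are mine, loosely echoing the Toolbox calls GetResource and GetNamedResource; the real on-disk resource map format is more involved):

```python
# Toy model of a resource fork: resources are indexed by a four-character
# type code plus a numeric ID, and may also carry a name for lookup.
# (Sketch only - ResourceFork and its methods are invented names that
# loosely mirror the classic Toolbox calls GetResource/GetNamedResource.)

class ResourceFork:
    def __init__(self):
        self._by_id = {}    # (type_code, res_id) -> bytes
        self._by_name = {}  # (type_code, name)   -> bytes

    def add(self, type_code, res_id, data, name=None):
        self._by_id[(type_code, res_id)] = data
        if name is not None:
            self._by_name[(type_code, name)] = data

    def get_resource(self, type_code, res_id):
        """Look up a resource by type and numeric ID."""
        return self._by_id.get((type_code, res_id))

    def get_named_resource(self, type_code, name):
        """Look up a resource by type and name - this is how HyperCard
        could find an external command's code by its HyperTalk name."""
        return self._by_name.get((type_code, name))


fork = ResourceFork()
fork.add("STR ", 128, b"Hello", name="Greeting")
fork.add("XCMD", 200, b"\x4e\x75", name="Flash")  # made-up code resource

assert fork.get_resource("STR ", 128) == b"Hello"
assert fork.get_named_resource("XCMD", "Flash") == b"\x4e\x75"
```

The key point the sketch captures is the dual index: code could ask for a resource either by (type, ID) or by (type, name), and localizers or HyperCard authors could work entirely at the name level.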
The original goal was to allow a non-programmer to rework the appearance of an application without ever touching the code. For this basic use, it worked OK. Sadly, it wasn't particularly scalable in implementation, yet it was useful enough to become badly abused by developers. For instance, any given resource couldn't be more than 32k in size (I think - it's been a while - but something lame like that) and you couldn't have more than 32k resources of a given type.
To answer your other questions, I believe the file sizes reported by HFS take both forks into account.
Because the forks had fundamentally different use patterns and separate APIs, nobody ever mixed them up.
Unfortunately, this "trick" became a liability in the wider world, because other file systems don't support forked files and many utilities simply left the resource fork behind.
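You can still see the fork on modern macOS, which exposes it through the special path <file>/..namedfork/rsrc - handy for checking whether a copy or transfer stripped it. A sketch (the helper names are mine, not any system API, and the check only does something meaningful on macOS):

```python
import os

# On macOS, a file's resource fork is reachable at the pseudo-path
# <file>/..namedfork/rsrc. A file whose fork was dropped by a
# fork-unaware utility will show an empty or missing fork there.
# (resource_fork_path / has_resource_fork are invented helper names.)

def resource_fork_path(path):
    return os.path.join(path, "..namedfork", "rsrc")

def has_resource_fork(path):
    try:
        return os.path.getsize(resource_fork_path(path)) > 0
    except OSError:
        # Non-macOS systems, missing files, and fork-less files land here.
        return False
```

On other platforms the `..namedfork` path simply doesn't resolve, which is exactly the portability problem described above: a plain byte-stream copy of the file has nowhere to put the second fork.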
Which sort of raises the question: what's the difference between a directory containing several files of different types and a file containing several different named streams?
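The equivalence is easy to demonstrate: the same "named streams" interface can be backed by an ordinary directory, with each stream stored as a file inside it - essentially the bundle approach Mac OS X adopted in place of resource forks. A sketch with invented names:

```python
import os
import tempfile

# The same "named streams" idea, backed by a plain directory: each
# stream is just a file inside it. This mirrors the bundle approach
# Mac OS X moved to. (DirectoryForks is an invented name for
# illustration, not a real API.)

class DirectoryForks:
    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def write_stream(self, name, data):
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(data)

    def read_stream(self, name):
        with open(os.path.join(self.root, name), "rb") as f:
            return f.read()


with tempfile.TemporaryDirectory() as tmp:
    bundle = DirectoryForks(os.path.join(tmp, "MyApp.bundle"))
    bundle.write_stream("data", b"document bytes")
    bundle.write_stream("resources", b"index + named items")
    assert bundle.read_stream("data") == b"document bytes"
```

The practical advantage of the directory form is precisely the one raised above: every byte-stream tool - tar, FTP, rsync - already knows how to carry it intact.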
From a practical standpoint, I don't see one.