1) The most difficult steps in the task might be the following:
Step 3: This might form a bottleneck because the pencil must be released slightly before it can be rotated, without letting it fall.
Step 7: The rotation in this step involves all three degrees of freedom of the effectors, which may be very difficult to learn simultaneously.
Step 13: This step involves relative motion between the pencil and the paper, and the robot must keep the motion smooth while avoiding slipping. This might be difficult because it requires constant feedback.
2) The explicit steps involved in the task could be the instructions for locating the coordinates or orientation of the pencil.
The implicitly learnt aspects of the execution could be the application of appropriate forces to hold and maneuver the pencil.
The implementation involves just one chunk because, once the robot has learnt the task, it is initiated as a single action sequence.
3) Humans do use similar reward-based learning. In human actions, however, the processes that constitute the sequence are first rewarded and reinforced in succession, and eventually the entire act is rewarded on its completion. So reward is involved at every incremental step of learning. This facilitates a hierarchical building up of higher-order actions and concepts.
Also, there exist intrinsically motivated actions at the lower levels that can initiate certain exploratory behaviour, acting as the foundation for reward-based learning.
In accordance with this, the firefighter example can also be considered reward-based process learning.
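The incremental reward scheme described in answer 3 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `episode_reward`, the constants `STEP_REWARD` and `COMPLETION_BONUS`, and their values are all hypothetical and do not come from the task description. Each sub-step that succeeds is rewarded in succession, and an extra reward is given only when the entire act is completed.

```python
from typing import Sequence

# Hypothetical values, chosen only for illustration.
STEP_REWARD = 1.0        # reward for each sub-step completed correctly
COMPLETION_BONUS = 5.0   # extra reward once the whole act is completed


def episode_reward(step_outcomes: Sequence[bool]) -> float:
    """Return the total reward for one attempt at a multi-step act.

    Sub-steps are rewarded in order. The attempt ends at the first
    failed sub-step, so later sub-steps earn nothing. The completion
    bonus is added only if every sub-step succeeded.
    """
    total = 0.0
    for succeeded in step_outcomes:
        if not succeeded:
            return total          # act abandoned: keep partial reward only
        total += STEP_REWARD      # reinforce this sub-step
    if step_outcomes:             # a non-empty act with no failures
        total += COMPLETION_BONUS
    return total
```

Under these assumptions, a learner that masters the first sub-steps already receives some reward, which is what allows higher-order actions to be built up from reinforced lower-order ones, while a scheme that rewards only the final outcome would give no signal until the whole sequence succeeds.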