When training a video action classifier model in Create ML, is it best to have only one person's poses in the frame (and crop out others)?
Yes.
If there are multiple people in the frame, try to keep the others consistently smaller than the main person. Training will then still work, because the person with the largest bounding box is selected automatically.
Check out the video from WWDC 2020:
Build an Action Classifier with Create ML (at 24m21s)
When you use the model in your application, make sure to pass it only a single person's poses. Your app can remind users to keep only one person in view when multiple people are detected. Alternatively, you can implement your own selection logic that chooses a person based on their size or location within the frame, using the coordinates from the pose landmarks.
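As a rough illustration of the size-based selection mentioned above, here is a minimal sketch in plain Swift. The `Landmark` and `PersonPose` types, the function names, and the 0.1 confidence threshold are all hypothetical stand-ins chosen for this example. In a real app the coordinates would presumably come from the pose observations returned by Vision's `VNDetectHumanBodyPoseRequest`, which you would need to map into something like these types yourself.

```swift
// Hypothetical stand-in for one detected joint: normalized x/y plus a confidence score.
struct Landmark {
    let x: Double
    let y: Double
    let confidence: Double
}

// Hypothetical stand-in for one detected person: that person's landmarks.
struct PersonPose {
    let landmarks: [Landmark]
}

// Area of the box enclosing all landmarks at or above the confidence threshold.
// Returns 0 when fewer than two landmarks qualify, since no box can be formed.
func boundingBoxArea(of pose: PersonPose, minConfidence: Double = 0.1) -> Double {
    let points = pose.landmarks.filter { $0.confidence >= minConfidence }
    guard points.count >= 2 else { return 0 }
    let xs = points.map { $0.x }
    let ys = points.map { $0.y }
    let width = xs.max()! - xs.min()!
    let height = ys.max()! - ys.min()!
    return width * height
}

// Index of the pose with the largest bounding box, or nil if no pose has a usable box.
func indexOfMainPerson(in poses: [PersonPose]) -> Int? {
    let areas = poses.map { boundingBoxArea(of: $0) }
    guard let best = areas.indices.max(by: { areas[$0] < areas[$1] }),
          areas[best] > 0 else {
        return nil
    }
    return best
}
```

You would call `indexOfMainPerson(in:)` once per frame and feed only the selected person's landmarks to the classifier. If you would rather pick by location, you could swap the area for the distance between the box center and the frame center and take the minimum instead.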