A variational autoencoder (VAE) whose representation captures both global and local variations can improve generalization in downstream tasks. We propose the Separated Paths for Local and Global Information (SPLIT) framework, which modifies standard VAE models so that they explicitly disentangle global and local visual features. We apply the framework to several downstream tasks: clustering, unsupervised object recognition, and visual reinforcement learning. We find that the learned representations are useful for these tasks and improve out-of-distribution generalization performance.