Computational molecular dynamics (MD) simulates particles under the influence of inter-particle forces. MD simulations are applicable in many areas, including the study of material properties at the microscopic scale and problems associated with DNA-protein molecular interactions. Parallel computers have increased interest in MD simulations because they allow both larger systems and longer time scales to be simulated. However, these two benefits are realizable only when the MD algorithm scales to large numbers of processors and the computational load is uniformly distributed among them. We address scalability by using an asynchronous mechanism for message passing. Our approach localizes synchronization, thereby reducing the latency effects of large multi-processor networks. We address load balance by employing a dynamic load balancing scheme that works in one-, two-, or three-dimensional domains. A heuristic based on geometric locality dynamically determines how load is redistributed among processors. As a result, our MD code is well suited to maintaining high efficiency when simulating large systems, even those with non-uniform particle distributions.