The common approach to diffuse optical tomography is to solve a nonlinear, ill-posed inverse problem by a linearized iteration that repeatedly invokes forward and inverse solvers on an appropriately discretized domain of interest. For large tissue volumes, as in breast tumor diagnosis and functional brain imaging, this scheme imposes severe computation and storage burdens and precludes the matrix-based linear inversions that would improve image quality. To cope with these difficulties, we propose a parallelized full domain-decomposition scheme that divides the whole domain into several overlapping subdomains and solves the corresponding subinversions independently within the framework of Schwarz-type iterations, supported by a combined multicore CPU and multithreaded graphics processing unit (GPU) parallelization strategy. Both numerical and phantom experiments demonstrate that the proposed method effectively reduces the computation time and memory usage for large-sized problems and improves the quantitative performance of the reconstruction.
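To make the idea of overlapping subdomains solved independently within a Schwarz-type iteration concrete, the following is a minimal sketch on a toy problem, not the reconstruction scheme itself. It assumes a one-dimensional linear system with the matrix tridiag(-1, 2, -1) in place of the tomographic inversion, three hand-picked overlapping index ranges, and a restricted update in which each subdomain writes back only the part it owns. All names (`solve_tridiagonal`, `schwarz_solve`, `subdomains`, `owned`) are hypothetical and chosen for illustration only.

```python
def solve_tridiagonal(rhs):
    """Solve tridiag(-1, 2, -1) x = rhs with the Thomas algorithm."""
    m = len(rhs)
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def schwarz_solve(b, subdomains, owned, tol=1e-10, max_iter=1000):
    """Parallel (Jacobi-style) overlapping Schwarz iteration.

    subdomains: overlapping half-open index ranges (lo, hi).
    owned:      non-overlapping ranges (olo, ohi), one per subdomain,
                marking the entries that subdomain writes back.
    """
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = x[:]
        # Each local solve reads only the previous iterate x, so the
        # loop body could be dispatched to separate workers.
        for (lo, hi), (olo, ohi) in zip(subdomains, owned):
            rhs = b[lo:hi]
            if lo > 0:
                rhs[0] += x[lo - 1]   # interface value from left neighbor
            if hi < n:
                rhs[-1] += x[hi]      # interface value from right neighbor
            local = solve_tridiagonal(rhs)
            x_new[olo:ohi] = local[olo - lo:ohi - lo]
        change = max(abs(u - v) for u, v in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, it
    return x, max_iter


n = 30
b = [1.0] * n
subdomains = [(0, 12), (8, 22), (18, 30)]  # overlapping ranges
owned = [(0, 10), (10, 20), (20, 30)]      # disjoint write-back ranges
x, iters = schwarz_solve(b, subdomains, owned)
reference = solve_tridiagonal(b)           # undecomposed solve
print(iters, max(abs(u - v) for u, v in zip(x, reference)))
```

Each local system here has at most 14 unknowns instead of 30, which is the source of the memory saving in this toy setting; the overlap width controls how quickly interface information propagates between neighbors and hence how many outer iterations are needed.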