A better place to ask is probably
/tech/.
For anyone who wants to give constructing this text corpus a shot: Capital has sadly been removed from Marxists.org due to copyright claims.
Fortunately, I think
http://marx2mao.com/ still has copies.
For anybody who wants to help: the text to be extracted and concatenated is from volumes 35, 36 and 37 of Marx and Engels' Collected Works.
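
If it helps anyone get started, here is a rough sketch of the concatenation step in Python. It assumes each volume has already been extracted to its own UTF-8 plain-text file. The function name and the file names in the comment are made up for illustration, not anything that exists yet.

```python
from pathlib import Path


def concatenate_volumes(paths, out_path):
    """Join already-extracted plain-text volumes into one corpus file.

    `paths` is an ordered list of text files (hypothetical names, one per
    volume); `out_path` is where the combined corpus gets written.
    """
    parts = []
    for p in paths:
        text = Path(p).read_text(encoding="utf-8")
        # Trim stray leading/trailing whitespace so volumes join cleanly.
        parts.append(text.strip())
    # Separate volumes with a blank line and end the file with a newline.
    corpus = "\n\n".join(parts) + "\n"
    Path(out_path).write_text(corpus, encoding="utf-8")
    return corpus


# Example call (file names are assumptions):
# concatenate_volumes(["vol35.txt", "vol36.txt", "vol37.txt"], "capital.txt")
```

This only covers the joining part. Getting clean text out of whatever format the source copies are in (HTML, PDF) is a separate step and depends on what is actually available.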