Large-scale scientific applications challenge computational scientists both to obtain high performance and to manage large datasets. These applications, most of which are simulations, may employ multiple techniques and resources in a heterogeneous distributed environment, and working effectively in such an environment is crucial for modern large-scale simulations. In this paper, we present an integrated Java graphical user interface (IJ-GUI) that provides a control platform for managing complex programs and their large datasets easily. To improve performance, we present and evaluate our initial implementation of two optimization schemes: data replication and data prediction. Data replication exploits temporal locality by caching remote datasets on local disks; data prediction provides prefetch hints based on the datasets' past activity, which is recorded in databases. We first introduce the concept of data contiguity in such an environment, which guides data prediction. We then discuss the relationship between the two approaches.
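
To make the interplay of the two schemes concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in this paper. All names (`ReplicaCache`, `AccessPredictor`, `read_with_hints`) are hypothetical; an in-memory dictionary stands in for the local disk and for the database of past activity; and a simple "most frequent successor" rule is substituted for the data contiguity concept, which is not specified here.

```python
from collections import Counter, defaultdict


class ReplicaCache:
    """Data replication sketch: keep a local copy of each remote dataset.

    Assumption: `fetch_remote` is a caller-supplied function mapping a
    dataset name to its contents; a dict stands in for the local disk.
    """

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote
        self._local = {}            # dataset name -> local replica
        self.remote_fetches = 0     # number of accesses that went remote

    def _ensure_local(self, name):
        # Go to the remote site only if no local replica exists yet.
        if name not in self._local:
            self._local[name] = self._fetch_remote(name)
            self.remote_fetches += 1

    def read(self, name):
        """Return the dataset, served from the local replica when possible."""
        self._ensure_local(name)
        return self._local[name]

    def prefetch(self, name):
        """Replicate a dataset ahead of its expected use."""
        self._ensure_local(name)


class AccessPredictor:
    """Data prediction sketch: hints derived from recorded past accesses.

    Assumption: the hint is the dataset that most often followed the
    current one in the recorded history (a stand-in heuristic).
    """

    def __init__(self):
        self._successors = defaultdict(Counter)  # name -> counts of next names
        self._last = None

    def record(self, name):
        """Append one access to the history."""
        if self._last is not None:
            self._successors[self._last][name] += 1
        self._last = name

    def hint(self, name):
        """Return the most frequent successor of `name`, or None if unknown."""
        followers = self._successors.get(name)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def read_with_hints(cache, predictor, name):
    """Read a dataset, record the access, and prefetch the predicted next one."""
    data = cache.read(name)
    predictor.record(name)
    predicted = predictor.hint(name)
    if predicted is not None:
        cache.prefetch(predicted)
    return data
```

In this sketch, replication pays off when a dataset is read again, while prediction attempts to have a replica in place before the first read; the predictor only produces useful hints once some access history has accumulated.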