The transmission of highly pathogenic avian influenza H5N1 virus to Southeast Asian countries triggered the first major outbreak and transmission wave in late 2003, heightening the pandemic threat to the world. Because influenza surveillance was lacking before these outbreaks, the genetic diversity and transmission pathways of H5N1 viruses from this period remain undefined. To determine the possible source of the wave 1 H5N1 viruses, we conducted further sequencing and analysis of samples collected in live-poultry markets in Guangdong, Hunan, and Yunnan in southern China from 2001 to 2004. Phylogenetic analysis of the hemagglutinin and neuraminidase genes of 73 H5N1 isolates from this period revealed greater genetic diversity in southern China than previously reported. Moreover, the results show that eight viruses isolated from Yunnan in 2002 and 2003 were most closely related to the clade 1 virus sublineage from Vietnam, Thailand, and Malaysia, while two viruses isolated from Hunan in 2002 and 2003 were most closely related to viruses from Indonesia (clade 2.1). Further phylogenetic analyses of the six internal genes showed that all 10 of these viruses maintained phylogenetic relationships similar to those observed for the surface genes. The 10 progenitor viruses were genotype Z and shared high similarity (≥99%) with their corresponding descendant viruses in most gene segments. These results suggest a direct transmission link for H5N1 viruses between Yunnan and Vietnam and also between Hunan and Indonesia during 2002 and 2003. Poultry trade may be responsible for virus introduction to Vietnam, while the transmission route from Hunan to Indonesia remains unclear.