Let's start with a bald statement: I believe that the web will continue to use the current lingua franca of page description, HTML 4.01. This is the zenith of the HTML series of standards: it describes the use of styles to provide separation between document model and appearance, and standardises the use of plug-in objects.
I see XHTML 1.0 and later as being a solution in search of a problem. XHTML 1.0 is a reformulation of HTML using the stricter XML model, which should allow a standard XML parser to parse XHTML successfully.
Unfortunately, XHTML is strictly incompatible with HTML 4.01. The problem stems from XML's requirement that every element be closed, whereas HTML allows some elements, such as br, to go without an end tag. XML offers a shorthand for elements with no content, e.g. <br />, but if HTML is interpreted strictly, that / is illegal.
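To make the mismatch concrete, here is a minimal sketch using the XML parser from Python's standard library. The two sample fragments and the helper name is_well_formed are my own illustrations, not taken from any real page; the point is only that an off-the-shelf XML parser rejects the HTML-style unclosed br and accepts the XML-style empty element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a standard XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML style: br has no end tag, so the XML parser sees </p> closing the wrong element.
html_style = "<p>line one<br>line two</p>"

# XHTML style: the empty element is closed with the /> shorthand.
xhtml_style = "<p>line one<br />line two</p>"

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

This only checks XML well-formedness. It says nothing about how a strict HTML parser treats the extra /.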
The intent of XHTML is to break HTML down into modules, which a browser can implement as required. The unfortunate part, of course, is that large swathes of the existing Web already contain elements missing from, for example, the XHTML Basic profile. To remain usable for a given user, the user's browser must implement all of HTML 4.01, which makes XHTML basically pointless.
Of course, IE has problems with XHTML anyway. The Jargon File renders strangely on IE due to mismatched character-set information. The server doesn't supply a character set: the HTTP headers only indicate Content-Type: text/html. The file I linked to is formulated as XHTML (and shouldn't be transmitted as text/html anyway); its <?xml?> declaration indicates encoding="UTF-8". Because the HTTP header names no character set, IE falls back to its default, Windows-1252, and displays the data incorrectly. IE also goes into Quirks mode, because there isn't a valid HTML 4.0 DTD.
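The character-set failure is easy to reproduce outside a browser. The sketch below is a deliberately simplified model, assuming the client looks only at the Content-Type header and otherwise falls back to Windows-1252 (cp1252 in Python); real browsers do more sniffing than this. The helper charset_from_header and the sample text are my own inventions for illustration.

```python
from email.message import Message


def charset_from_header(content_type: str, default: str = "cp1252") -> str:
    """Return the charset named in a Content-Type value, else a fallback (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when the header carries no charset parameter.
    return msg.get_content_charset() or default


# Hypothetical sample text, encoded the way a UTF-8 page would arrive on the wire.
page_bytes = "naïve café".encode("utf-8")

bare = charset_from_header("text/html")
labelled = charset_from_header("text/html; charset=UTF-8")

print(bare, "->", page_bytes.decode(bare))          # cp1252 -> naÃ¯ve cafÃ©
print(labelled, "->", page_bytes.decode(labelled))  # utf-8 -> naïve café
```

Each accented character is two bytes in UTF-8, and the fallback decoder shows each byte as its own character, which is the kind of wrong result described above.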