PC SOFT

ONLINE HELP
FOR WINDEV, WEBDEV AND WINDEV MOBILE


HTMLToText (Function)
 
Converts an HTML string or an HTML buffer into a text string. The following operations are performed during the conversion:
  • HTML tags are deleted,
  • Special HTML characters are converted,
  • CR characters (Carriage Return) are converted into space characters,
  • Multiple consecutive spaces are reduced to a single space.
Versions 15 and later
PHP: This function is now available for PHP sites.
Android: This function is now available for Android applications.
Versions 18 and later
Android Widget: This function is now available in Android Widget mode.
Versions 19 and later
WINDEV, Linux: This function is now available for WINDEV applications in Linux.
WEBDEV - Server code, Linux: This function is now available for WEBDEV sites in Linux.
Versions 21 and later
iPhone/iPad: This function is now available for iPhone/iPad applications.
Universal Windows 10 App: This function is now available in Universal Windows 10 App mode.
Example
MyHTMLText is string = "<!--test--><b><i>&quot;Hello!&quot;</i></b>"
Text is string = HTMLToText(MyHTMLText)
// Text is set to: "Hello!"
WINDEV, WEBDEV - Server code, Reports and Queries, User code (UMC)
// If the HTML document is set to:
//<HTML>
// <HEAD>
//  <TITLE>This is a test for a Web page</TITLE>
//  <META http-equiv="content-type" content="text/html; charset=UTF-8">
// </HEAD>
// <BODY>
// <P>This is &nbsp;&nbsp;&nbsp;&nbsp; an HTML page in English</P>
// It contains 1 paragraph<BR /><DD>a tabulation<BR />and 3 line skips
//  <BR /><A href="http://www.pcsoft.fr">This is a link</A>
// </BODY>
//</HTML>

Text = HTMLToText(MyHTMLText)
// Text will contain:
// This is        an HTML page   in English.
//
// It contains 1 paragraph
//   a tabulation
// and 3 line skips
// This is a link
Syntax
<Result> = HTMLToText(<Text in HTML Format> [, <Charset Used>])
<Result>: Character string
Text resulting from the HTML conversion. The result is encoded in the current character set of WINDEV or WEBDEV.
<Text in HTML Format>: Character string or buffer (with quotes)
Text to convert.
<Charset Used>: Optional integer constant
Constant identifying the character set in which <Text in HTML Format> is written.
By default, the current character set of WINDEV or WEBDEV is used (charsetCurrent constant).
If <Text in HTML Format> itself contains information about its character set, that information takes priority over this parameter.
See Correspondence between languages, sub-languages, character sets and nations for more details.
Android, Android Widget: This parameter is not available.
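As an illustration of the syntax above, here is a minimal sketch that converts an HTML file loaded into a buffer. The file path is hypothetical. The sketch assumes the standard WLanguage elements fLoadBuffer, ErrorOccurred, ErrorInfo, Error and Trace, and it omits <Charset Used> so that the default described above applies.

```wlanguage
// Hypothetical path: replace with an HTML file that exists on your computer
sFilePath is string = "C:\Temp\page.html"

// Load the raw content of the file into a buffer
bufHTML is Buffer = fLoadBuffer(sFilePath)
IF ErrorOccurred THEN
	Error("Unable to read " + sFilePath, ErrorInfo())
ELSE
	// <Charset Used> is omitted: the current character set applies,
	// unless the HTML itself declares a character set
	sText is string = HTMLToText(bufHTML)
	Trace(sText)
END
```

Passing a buffer instead of a string avoids converting the file content before HTMLToText has had a chance to read the character set information.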
Remarks

Conversion rules

  • The HTML tags are analyzed in order to keep the layout of the output text as close as possible to the original (CR characters, space characters, tabulations). Character formatting is not kept: bold, italic, colors, ...
  • The following elements do not appear in the output text:
    • the HTML tags
    • the content of the "header" (information found in the <HEAD> tag)
    • the comments
    • the control texts
    • the scripts
    • the SSL definitions
    • the CSS styles (except color)
    • the form elements
  • Management of CR characters
    • 2 CR characters are inserted to replace the following tags: <P>, <H1> to <H6>, <TABLE>, <UL> or <OL>
    • 1 CR character is inserted to replace the following tags: <BR>, <TR>, <LI>, <DD> or <DIV>
    • A single CR character is inserted when several identical tags (<TR>, <LI>, <DD> or <DIV>) follow one another (this does not apply to <BR> tags)
  • Management of tables
    • A CR character is inserted for each table row (<TR> tag).
    • A tabulation is inserted for each table column (<TD> tag).
  • Management of special characters
    A special character is a character defined in the HTML standard. For example, a space character can be written as "&nbsp;". These special characters are converted automatically.
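The table rules above can be sketched as follows. This is a minimal example under stated assumptions, not a description of the internal implementation: the HTML content is made up, and the code only assumes that rows end up separated by CR and cells by TAB, without relying on the exact number of CR characters produced around the table.

```wlanguage
// Made-up HTML table used for illustration only
sHTML is string = "<TABLE><TR><TD>Name</TD><TD>City</TD></TR><TR><TD>Ann</TD><TD>Paris</TD></TR></TABLE>"
sText is string = HTMLToText(sHTML)

// Browse the rows (separated by CR), then the cells of each row (separated by TAB)
FOR EACH STRING sRow OF sText SEPARATED BY CR
	// Skip the empty lines that may surround the table
	IF sRow <> "" THEN
		FOR EACH STRING sCell OF sRow SEPARATED BY TAB
			Trace("Cell: " + sCell)
		END
	END
END
```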

Supported tags

Unsupported tags are ignored: the tags themselves are removed and their content is treated as plain text.
The supported tags are as follows:
  • <PRE>
  • <UL>: Line break + tabulation
  • <OL>: Line break + tabulation
  • <LI>: Tabulation
  • <H1>: Line break before and line break after
  • <H2>: Line break before and line break after
  • <H3>: Line break before and line break after
  • <H4>: Line break before and line break after
  • <H5>: Line break before and line break after
  • <H6>: Line break before and line break after
  • <P>: Line break before and line break after
  • <BR>: Line break
  • <DL>: Line break
  • <DT>: Line break
  • <DD>: Tabulation and line break
  • <TABLE>: Line break
  • <TR>: Line break
  • <TD>: Elements separated by a tabulation
  • <HEAD>: Content ignored, except for the parameters of the character set
  • <STYLE>: Content ignored
  • <SCRIPT>: Content ignored
  • <!-- -->: Comments ignored
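To check how the list above applies to a given page, a small test such as the following can help. It is a hedged sketch: the HTML content is made up, the word "ignored" is only a marker placed inside the elements that should be dropped, and Position is assumed to return 0 when the marker is not found.

```wlanguage
// Made-up page: the marker "ignored" appears only in content that should be dropped
sHTML is string = "<HTML><HEAD><TITLE>ignored title</TITLE><STYLE>p { color: red; } /* ignored */</STYLE></HEAD>"
sHTML += "<BODY><!-- ignored comment --><SCRIPT>var ignored = 1;</SCRIPT>"
sHTML += "<H1>Heading</H1><UL><LI>First item</LI><LI>Second item</LI></UL></BODY></HTML>"

sText is string = HTMLToText(sHTML)

// Position returns 0 when the searched string is not found
IF Position(sText, "ignored") = 0 THEN
	Trace("Header, style, script and comment content were dropped")
ELSE
	Trace("Some ignored content is still present")
END

// Display the remaining text to inspect the line breaks and tabulations
Trace(sText)
```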

Managing the character set

To find out the character set used in the HTML text, HTMLToText uses the information found in the CONTENT attribute of a <META> tag.
If this tag is not found, the character set used to write the HTML text must be specified in <Charset Used>.
For example, if the HTML content uses an Arabic character set while WINDEV/WEBDEV uses a French character set by default, the output text will contain invalid characters.
Notes:
  • If the output text contains several "?" characters, some characters of the character set used in the HTML document cannot be represented in the character set of the current language.
  • The UTF-8 character set is commonly used to encode Web pages.
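The case described above, where the HTML carries no character set information, can be sketched as follows. The file path is hypothetical and the file is assumed to be saved in an Arabic character set. The charsetArabic constant is assumed to be one of the character set constants accepted by <Charset Used>. The "??" test is only a rough heuristic based on the first note.

```wlanguage
// Hypothetical file: an HTML fragment without <META> charset information,
// assumed to be saved in an Arabic character set
bufHTML is Buffer = fLoadBuffer("C:\Temp\arabic_fragment.html")

IF NOT ErrorOccurred THEN
	// Specify how the source text is encoded
	sText is string = HTMLToText(bufHTML, charsetArabic)

	// Several "?" characters suggest that some characters
	// cannot be represented in the current character set
	IF Position(sText, "??") > 0 THEN
		Trace("Possible character set mismatch")
	END
	Trace(sText)
END
```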
Android, Android Widget

Limitations

The result produced by HTMLToText in Android may differ from the one produced in Windows. The conversion rules and the list of supported tags described above do not apply in Android.
Component: wd240rtf.dll
Minimum version required
  • Version 12