
Perl6 (Rakudo) - How to handle special characters from file?

How can I read special characters from an external file? Here is a simple .txt file in French, whose content is the first paragraph of https://fr.lipsum.com/: as you can see in my screenshot, the file encoding is UTF-8, but the accents are not displayed correctly.

I tried various encodings in Notepad++ and in my Perl 6 script, like these:

enc => "utf8"
enc => "latin1"

With Python or Ruby scripts I don't encounter the problem. I couldn't find any precise example on this matter, probably because Perl 6 is still quite recent (??). Thank you.

My script, as displayed in the screenshot:

my $text_contents = slurp "testfile.txt", enc => "utf8";
say $text_contents;
prompt;

[Screenshot: Perl 6 script, input file in Notepad++, execution in cmd.exe]


Final edit: the solution is to enable an option, available in beta with Windows 10 1803, that makes the OS handle Unicode characters properly: see the answers and comments below.

Comments

If you're not using Windows

This SO question is either entirely or almost entirely irrelevant to you.

If you're using Windows 10

Check the "Beta: Use Unicode UTF-8 for worldwide language support" option checkbox.

At least at the time I originally wrote this answer, text near this Unicode related checkbox claimed it's for programs that do not support Unicode, but you should just ignore that.[1]

At the time I originally wrote this answer, the checkbox was found in the Control Panel: "Region" entry, "Administrative" tab, "Change system locale" button.

Microsoft may have changed this since I wrote this answer, and may change it again, e.g. by moving and/or renaming the checkbox, or by making things more involved than just clicking a single checkbox.

Per their comment below this answer, the OP notes:

For those who are interested in that particular option, it can be found in the "legacy" Control panel of windows -> Region -> Administrative -> Edit settings...

If you're using an older version of Windows

Arguably, the good news is that Raku and Rakudo have some of the world's best modern support for Unicode, and the OK news is that it relies on Microsoft correctly supporting Unicode, which they're now trying to do.

The bad news is that they made a lot of mistakes in older versions of Windows (and even in Windows 10, which they're now trying to fix), so any solution will be constrained by those mistakes. (Perhaps the biggest problem is Microsoft's doublespeak on the topic[1], but let's hope we can work around that.)

That all said, please read the following and then either return to searching for solutions or post a fresh SO question and we'll try to help.


Quoting Wikipedia's page Unicode in Microsoft Windows:

they are still in 2018 improving their operating system support for UTF-8

Microsoft got off on the wrong foot with their Unicode support last century. The good news is that they have at last begun digging their way out of the hole they dug for themselves and everyone else.

But they're definitely not there yet -- not at the time of originally writing this answer, and, I suspect, not for another N years -- at least inasmuch as things don't work correctly out of the box for many end users. I think this is the root of most problems with Unicode on Windows.

Older languages like Python, Ruby and Perl came up with a range of hacks that hid the many problems with Microsoft's older UTF8 support from most users in simple scenarios by using what Microsoft ironically described as "Unicode support".

This has always come with the trade-off that things get very hairy or even completely unworkable for more complex applications in many locales around the world. (So much so that even the mighty Microsoft finally capitulated in 2018.)

In essence, until this new Microsoft effort to get with the program, software that ran on Windows has had no alternative but to either use their fundamentally broken "Unicode support" or to actually support Unicode properly.[1]

Raku and Rakudo focused on the latter, and problems with it when run on Windows stem from this conflicting with Microsoft's old broken approach. Fortunately, Microsoft is now getting with the program, so we may be able to find a way around problems you have with Unicode on Windows, provided you are patient.

In particular, if you are using an older Windows version, please expect it to not work at first with modern Unicode aware software unless you are lucky. We'll still help if we can, but it'll likely involve you being patient with us and Microsoft and Rakudo and vice-versa.
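One way to check whether the trouble lies in reading the file or in displaying it is to print facts about the decoded string using only ASCII characters, which the console cannot garble. The following is a minimal sketch, not part of the original question or answer: describe-text is a hypothetical helper, and the sample string stands in for the result of slurp "testfile.txt", enc => "utf8".

```raku
# Hypothetical helper (not from the original post): report facts about a
# decoded string using ASCII-only output, so the console cannot garble it.
sub describe-text(Str $text --> Str) {
    # Unique code points above the ASCII range, formatted as U+XXXX.
    my @non-ascii = $text.ords.grep(* > 127).unique.map({ sprintf 'U+%04X', $_ });
    my $chars = $text.chars;                  # characters after decoding
    my $bytes = $text.encode('utf8').bytes;   # size when re-encoded as UTF-8
    return "chars=$chars bytes=$bytes non-ascii=" ~ @non-ascii.join(',');
}

# Stand-in for: slurp "testfile.txt", enc => "utf8"
my $sample = "d\x[E9]j\x[E0] vu";
say describe-text($sample);
```

If the report lists the expected code points (U+00E9 for é, U+00E0 for à), the file was decoded correctly and the garbling happens only when the text is written to cmd.exe. If it lists pairs such as U+00C3 and U+00A9 instead, the UTF-8 bytes were decoded as Latin-1.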

Footnotes

[1] At the time I originally wrote this answer, text near the checkbox said it's for programs that do not support Unicode. This is entirely the opposite of what's really going on, but hey, it's Microsoft.

