
The F# Language


The translation is quite good; an excerpt follows:

F# for Parallel and Asynchronous Programming
2009-11-30 15:23

 

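Luke Hoban, the F# program manager, gave a talk titled "F# for Parallel and Asynchronous Programming". In it he laid out several key points and challenges of parallel and asynchronous programming, and explained how F# addresses them through both language features and library support.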

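F# is another programming language on the .NET platform, combining the functional and object-oriented paradigms. Microsoft Research began designing and developing F# about five to six years ago, and a stable release with its tools will ship alongside Visual Studio 2010, ready for production development. Like C# and VB, F# is a strongly, statically typed language. Luke pointed out that F# is a general-purpose language suitable for all kinds of programs, but that many of its features fit especially well with applications that are algorithm-heavy or in which parallelism and asynchrony play a large part.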

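Luke named four challenges of parallel programming in his talk. The first is "shared state". Shared state here means mutable state: memory that several parallel components may modify at the same time. In that situation every modification can affect multiple components, and careless handling leads to unpredictable behavior. Components that look logically separate are then not actually independent of one another, which makes the system hard to maintain. Such code is also hard to test, because reproducing a test requires every parallel component to be in a specific state. Perhaps the biggest problem is that such code is hard to parallelize heavily: components that share a block of memory will almost certainly need locks. Locks are hard to get right because they depend on how and when each component uses that memory. Adding a component, or even making a small change to an existing one, can cause the application to share a new piece of memory and silently break thread safety.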
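F# answers by reinforcing an immutable style of programming at the language level. Immutable programming means avoiding modifiable in-memory data wherever possible. In F#, all variables, functions, parameters and so on are by default unmodifiable once bound to an identifier. Several common F# data types, such as records, tuples and discriminated unions, are immutable. To obtain an object in a different state, a developer can only create a new one rather than modify the old one, and F# provides language features to assist with this. Because immutable data cannot change underneath a reader, it can be used from parallel computations without worrying about thread safety. For example, PLINQ in .NET 4.0 (wrapped for F# as a PSeq module) can map, filter or sum a sequence. F# also provides common immutable data structures such as List, Set and Map. As for object-oriented programming, Luke pointed out that F# has features that encourage developers to build immutable types: types with no setters, where each method only computes and returns a result from its arguments instead of changing internal state.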
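
As a minimal sketch of the "create, don't modify" style described above: the Account record and the values below are hypothetical and do not come from the talk. The example shows copy-and-update on a record and a non-destructive add on an immutable Map.

```fsharp
// Hypothetical record type for illustration; records are immutable by default.
type Account = { Owner: string; Balance: decimal }

let original = { Owner = "Ada"; Balance = 100m }

// Copy-and-update builds a new record; `original` is left untouched.
let afterDeposit = { original with Balance = original.Balance + 50m }

// Map.add returns a new map rather than mutating the existing one.
let rates = Map.ofList [ "USD", 1.0 ]
let moreRates = rates |> Map.add "EUR" 0.9

printfn "%M -> %M, %d -> %d entries"
    original.Balance afterDeposit.Balance rates.Count moreRates.Count
```

Since neither `original` nor `rates` can change after construction, both can be handed to any number of threads without locks.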

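The second challenge Luke raised is inversion of control in asynchronous programming. Developers are used to writing sequential programs, expressing logic one line after another. For long-running operations, however, doing so blocks the program's main thread; in a UI program a blocked main thread freezes the interface, so asynchronous calls are usually needed. Asynchronous calls, however, split the program logic into two or more stages, with each stage passing its result to the next through a callback. Code like this is very hard to write and usually requires a large amount of control code for concerns such as exception handling or task cancellation. The traditional .NET asynchronous programming model, with its separate Begin/End methods, does not solve this problem. As more logic needs to be made asynchronous, and especially once loops or conditionals are involved, the program quickly becomes more and more complex.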
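F# provides a language feature called workflows to address this. A workflow can be thought of as F#'s version of a monad implementation; its key trait is that the compiler desugars sequentially written code into callback form so that some of its steps can run asynchronously. Luke demonstrated a WPF application written in C# that downloads images from the Azure cloud, in which long synchronous operations froze the UI. Turning that synchronous logic into asynchronous code took several pages. The main problem was that the state of a simple for loop had to be kept in an extra context object, so the logic kept switching between the business part and the asynchronous control part, producing code that is hard to write and maintain. With F#, the same job only requires wrapping the original logic in async { ... } to form an asynchronous workflow, and then changing the let and do of the time-consuming operations to let! and do!. This tells F# that those steps should hand control back to the framework while they run, and that the remaining logic should continue through a callback once the result arrives. F# handles the original for loop correctly, and the code looks no different from its sequential form. Notably, the library Luke built in F# for the demo could be used directly by the C# WPF application; the only change was importing a different namespace.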
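
A minimal sketch of this idea, not the demo code from the talk: fetchLength is a hypothetical stand-in for a slow I/O call, simulated with Async.Sleep. The point is that the for loop stays an ordinary for loop even though each let! is a suspension point.

```fsharp
// Hypothetical stand-in for a slow I/O call; Async.Sleep waits without
// blocking a thread.
let fetchLength (name: string) = async {
    do! Async.Sleep 20
    return name.Length }

// Reads like sequential code, yet each let! lets the workflow suspend and
// later resume through a compiler-generated callback.
let totalLength (names: string list) = async {
    // A ref cell is used because closures in a workflow cannot capture
    // a `let mutable` local.
    let total = ref 0
    for name in names do
        let! n = fetchLength name
        total.Value <- total.Value + n
    return total.Value }

let result = totalLength [ "a.png"; "bb.png" ] |> Async.RunSynchronously
printfn "total length = %d" result
```

Async.RunSynchronously is used here only to get a value out in a script; a UI application would start the workflow without blocking the UI thread.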

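The third challenge is an application's interaction with I/O devices, such as a disk or a remote cloud; this is I/O-bound logic. Because the various I/O devices (such as disks and network cards) are usually independent of each other, several I/O requests must be issued at the same time to make full use of them and improve the program's performance and responsiveness. This is I/O parallelism. Multiple asynchronous blocks created with async { ... } can be combined by F# into a single asynchronous block, which can then run as a .NET 4.0 Task. The asynchronous I/O operations inside each block use let!, handing control to F# while they run and preserving the sequential shape of the original logic. Because each I/O operation is asynchronous, it does not occupy one of the application's worker threads. So even when many I/O requests are in flight at once, Task Manager shows that the application uses only a small number of threads.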
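
The composition described above might look like the following sketch. simulatedDownload is a hypothetical placeholder that sleeps instead of doing real I/O; Async.Parallel and Async.StartAsTask are the FSharp.Core functions that combine the blocks and expose them as a Task.

```fsharp
open System.Threading.Tasks

// Hypothetical placeholder for an I/O-bound operation.
let simulatedDownload (id: int) = async {
    do! Async.Sleep 50          // no thread is held while waiting
    return id * 10 }

// Combine twenty independent async blocks into one Async<int[]>.
let combined : Async<int[]> =
    [ 1 .. 20 ]
    |> List.map simulatedDownload
    |> Async.Parallel

// Run the combined block as a .NET Task and wait for the results,
// which come back in the same order as the inputs.
let downloadTask : Task<int[]> = Async.StartAsTask combined
let results = downloadTask.Result
printfn "%d results, first = %d" results.Length results.[0]
```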

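The last challenge is that parallel applications usually scale up easily but are hard to scale out to a cluster of many cheap machines. Good scale-out behavior has to be a goal from the very start of a program's design. This usually means programming with messages and agents, a basic technique for making parallel programs scalable. Erlang and Microsoft's Axum both use similar ideas, and F# provides an agent component that has continued to evolve from release to release. With agents, direct dependencies between components disappear; they communicate purely by passing messages. F#'s agent component is MailboxProcessor, whose Start function supplies an inbox. A developer can call the inbox's Receive method to begin a non-blocking receive: because it is used with let! inside an asynchronous block, that line does not block a thread but hands control to F# until a message arrives. Each agent is a very lightweight object with no one-to-one relationship to a thread. So even a large number of agents does not consume many system resources, and F# relies on the TPL in .NET 4.0 to use the available computing power sensibly and fully.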
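
A minimal agent sketch along these lines: the Message type and the counter agent are hypothetical examples, while MailboxProcessor.Start, Receive, Post and PostAndReply are the FSharp.Core members mentioned above.

```fsharp
// Hypothetical message protocol for a counting agent.
type Message =
    | Add of int
    | GetTotal of AsyncReplyChannel<int>

let counter =
    MailboxProcessor<Message>.Start(fun inbox ->
        // The running total lives in the loop argument, so the agent needs
        // no mutable state and no locks.
        let rec loop total = async {
            let! msg = inbox.Receive()      // waits without blocking a thread
            match msg with
            | Add n -> return! loop (total + n)
            | GetTotal reply ->
                reply.Reply total
                return! loop total }
        loop 0)

// Other components talk to the agent only by sending messages.
[ 1 .. 100 ] |> List.iter (fun n -> counter.Post(Add n))
let total = counter.PostAndReply(fun channel -> GetTotal channel)
printfn "total = %d" total
```

Messages posted from one thread are processed in order, so the reply reflects all one hundred Add messages.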

 

Source: http://hi.baidu.com/lewutian/blog/category/%D1%A7f%23%D3%EF%D1%D4

Reference: http://developer.51cto.com/art/200906/131896.htm

 

Original: F# for Parallel and Asynchronous Programming - PDC 2009

Last November at PDC 2009 in Los Angeles I gave a talk on F# for Parallel and Asynchronous Programming. 

The talk begins by covering basic F# concepts, and then focuses on four challenging issues related to concurrency and the tools F# brings for addressing these - immutability, async workflows, and agents.  In trying to cover this ground in a 60 minute talk, there were naturally a lot of relevant topics I had to skip, and I also got a number of questions about things I'd glossed over after the talk.  So I thought I'd use this post to discuss some of the other features and topics I would have highlighted in this presentation if there'd been more time.
 

Note:  The topics covered below mostly assume you've either watched the session video or taken a look through the source code attached above.  If you haven't yet - check out the talk.

 

Some F# Parallel and Async Topics


PSeq

The talk briefly shows the use of a PSeq module to write code like the following:

let sumOfSquares nums =
    nums
    |> PSeq.map sqr
    |> PSeq.sum

As a few folks have noted, the F# core library does not actually provide this PSeq module.  In the demo code, I was just using my own little implementation of this, to show off the benefits of the more declarative programming style used here. 

The PSeq I used is just a very simple wrapper around the functionality provided in PLinq - for example, PSeq.map just calls ParallelEnumerable.Select.  We are expecting a future release of the F# PowerPack to include this wrapper library.  In the meantime, you can check out Matt Podwysocki's blog post on using PLinq from F#, or Talbot Crowell's recent post with an updated Beta2 version.
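
For readers who want to try the snippet above, here is a rough sketch of what such a wrapper could look like. It is not the demo's actual implementation: only the mapping of PSeq.map to ParallelEnumerable.Select is stated above, while the toParallel helper and the use of ParallelEnumerable.Sum for PSeq.sum are assumptions, as is the definition of sqr.

```fsharp
open System
open System.Linq

module PSeq =
    // Assumed helper: reuse an existing ParallelQuery, otherwise opt in to PLINQ.
    let private toParallel (source: seq<'T>) : ParallelQuery<'T> =
        match source with
        | :? ParallelQuery<'T> as pq -> pq
        | _ -> source.AsParallel()

    // PSeq.map forwards to ParallelEnumerable.Select.
    let map (f: 'T -> 'U) (source: seq<'T>) : seq<'U> =
        ParallelEnumerable.Select(toParallel source, Func<'T, 'U>(f)) :> seq<'U>

    // Assumed mapping: PSeq.sum forwards to ParallelEnumerable.Sum (ints only here).
    let sum (source: seq<int>) : int =
        ParallelEnumerable.Sum(toParallel source)

let sqr x = x * x

let sumOfSquares nums =
    nums
    |> PSeq.map sqr
    |> PSeq.sum

printfn "%d" (sumOfSquares [ 1 .. 10 ])
```

The calling code keeps the same declarative pipeline shape as with Seq, which is the benefit the talk was highlighting.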

 

Using Begin/End Methods with F# Async

In showing the ease of turning synchronous code which calls long-running I/O operations into asynchronous code, I showed adding the pieces highlighted in red to the code below to make it async:

let downloadImageAsync(blob : CloudBlob) =
  async {
    let! pixels = blob.AsyncDownloadByteArray()
    let fileName = "thumbs-" + blob.Uri.Segments.[blob.Uri.Segments.Length-1]
    use outStream =  File.OpenWrite(fileName)
    do! outStream.AsyncWrite(pixels, 0, pixels.Length)
    return fileName  }

I got a very good question after the talk though: "Where did the AsyncDownloadByteArray function come from?" 

In general, for any asynchronous API provided on the platform, it can be simply wrapped and exposed as a method returning Async<T> to be used within F# async workflows as above.  In the core F# libraries, we wrap many of the core async APIs in .NET Framework, providing extension methods on the relevant types.  This means that functions like Stream.AsyncWrite used above are available by default in F#.  However, CloudBlob.DownloadByteArray is defined in the libraries provided with the Windows Azure SDK, and so we need to create a wrapper.

In fact, the Azure SDK doesn't even provide a BeginDownloadByteArray/EndDownloadByteArray pair - but it does provide a more fundamental async operation BeginDownloadToStream/EndDownloadToStream:

public IAsyncResult BeginDownloadToStream(Stream target, AsyncCallback callback, object state);
public void EndDownloadToStream(IAsyncResult asyncResult);

Notably, it's likely that the reason for not including the higher level async operations is just due to the substantial complexity of authoring and maintaining these methods in C# today. Had the Azure team written their API in F#, it's likely they would have found it sufficiently easy to add these.

So, what does the wrapper look like?

type CloudBlob with
    /// Asynchronously download blob contents as an array
    member blob.AsyncDownloadByteArray() = async {
        let ms = new MemoryStream(int blob.Properties.Length)
        do! Async.FromBeginEnd(
                (fun (cb,s) -> blob.BeginDownloadToStream(ms,cb,s)),
                blob.EndDownloadToStream)
        return ms.ToArray() }

Some notes on this implementation:

Async.FromBeginEnd

The key tool for wrapping many .NET async APIs is Async.FromBeginEnd.  This method takes two functions as arguments - the Begin/End pair of the method being wrapped - and returns a value of type Async<T>, where T is the type returned from the End method. When directly wrapping a method, this is often trivial - member foo.AsyncBar() = Async.FromBeginEnd(foo.BeginBar, foo.EndBar).
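
To make this concrete without the Azure SDK, here is a small sketch that wraps the Begin/End pair of System.IO.Stream. The function name readChunkAsync is made up for this example; Stream.BeginRead, Stream.EndRead and Async.FromBeginEnd are the real APIs.

```fsharp
open System
open System.IO

// Hypothetical helper: wrap Stream.BeginRead/EndRead as an Async<int>
// that yields the number of bytes read into the buffer.
let readChunkAsync (stream: Stream) (buffer: byte[]) : Async<int> =
    Async.FromBeginEnd(
        (fun (cb, s) -> stream.BeginRead(buffer, 0, buffer.Length, cb, s)),
        (fun (iar: IAsyncResult) -> stream.EndRead iar))

let bytesRead =
    use ms = new MemoryStream([| 1uy; 2uy; 3uy |])
    let buffer = Array.zeroCreate<byte> 8
    readChunkAsync ms buffer |> Async.RunSynchronously

printfn "read %d bytes" bytesRead
```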

Composing Async Operations

There are generally two ways to create Async<T> values - call a function which returns one, like Async.FromBeginEnd, Stream.AsyncWrite, or Async.Parallel - or write an async { ... } block to include any arbitrary F# code as part of the composition.  In the case above, we need to call the underlying Begin/End pair, but also construct a stream to download into, and ultimately return the array of bytes left in the resulting stream.  This composition of asynchronous operations with some synchronous code is easy to do inside the async { ... } block.

Other Async Interop Utilities

We saw details above of wrapping Begin/End pairs as Async<T> objects - there are a few other interesting interop scenarios which are nicely supported by the Async API in F#:

  • Async.AsBeginEnd - converts an async operation into a set of Begin/End/Cancel functions which can be used to expose an implementation of the .NET Asynchronous Programming Model to other .NET consumers.
  • Async.AwaitEvent/AwaitWaitHandle/AwaitIAsyncResult/AwaitTask - there are many other async models used in .NET beyond the APM, and F# provides primitives for easily wrapping any of these.
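
As one small illustration of these primitives, the sketch below consumes .NET Tasks from inside a workflow with Async.AwaitTask. The tasks are trivial placeholders (Task.FromResult and Task.Delay) chosen only so that the example runs on its own.

```fsharp
open System.Threading.Tasks

let workflow = async {
    // Await a Task<int> and bind its result.
    let! n = Async.AwaitTask(Task.FromResult 21)
    // Await a plain Task that carries no result.
    do! Async.AwaitTask(Task.Delay 10)
    return n * 2 }

let answer = workflow |> Async.RunSynchronously
printfn "answer = %d" answer
```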

 

Exceptions with Async

In the talk, I commented on how hard it is to do correct exception handling with typical asynchronous programming tools available today in C#/VB.  But I didn't actually show how much easier this is with F# async workflows.  Here's what it looks like to make our downloadImageAsync function robust to the many exceptions that could possibly be raised during its execution:

let downloadImageAsync(blob : CloudBlob) =  async {
    try
        let! pixels = blob.AsyncDownloadByteArray()
        let fileName = "thumbs-" + blob.Uri.Segments.[blob.Uri.Segments.Length-1]
        use outStream =  File.OpenWrite(fileName)
        do! outStream.AsyncWrite(pixels, 0, pixels.Length)
        return fileName
    with
    | e ->
        Log.Write(e.ToString())
        return ""  }

The key here is that exception handling is no different for async code than for synchronous code.  Under the hood, F# will ensure that exceptions occurring anywhere in the workflow are bubbled up through the non-blocking calls and ultimately handled by the exception handler code.  This will happen as expected even when the code executes on multiple different threads during its operation.
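
A self-contained variant of the same pattern, for experimenting without the Azure SDK: riskyDownload and safeDownload are hypothetical functions, and the failure is raised after an asynchronous wait so that it has to travel back through a continuation before reaching the handler.

```fsharp
open System

// Hypothetical operation that fails for an empty name after a
// non-blocking wait.
let riskyDownload (name: string) : Async<int> = async {
    do! Async.Sleep 10
    if name = "" then
        return raise (ArgumentException("empty name"))
    else
        return name.Length }

// An ordinary try/with catches the exception even though it was raised
// after the workflow resumed from the wait.
let safeDownload name = async {
    try
        let! n = riskyDownload name
        return Some n
    with _ ->
        return None }

let ok = safeDownload "photo.png" |> Async.RunSynchronously
let failed = safeDownload "" |> Async.RunSynchronously
printfn "ok = %A, failed = %A" ok failed
```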

 

Cancellation with Async

When we have long-running operations in our programs, as well as making them non-blocking, we very often want to make them cancellable.  F# async workflows support cancellation automatically, inserting cancellation checks at each let! or do!, and also at each iteration of a loop.  Cancellation is tracked and triggered using the new unified cancellation model in the .NET Framework, "System.Threading.CancellationToken".  For example, if we want to make downloadImageAsync cancellable, we simply need to pass in a cancellation token when we start the workflow, and then hook up a button to cancel a running operation.  The deltas to the PhotoViewer application are indicated in red below:

F# code:

member this.DownloadAll(cancellationToken) =
    let work =
        container.ListBlobs()
        |> Seq.map (fun blob ->
            downloadImageAsync(container.GetBlobReference(blob.Uri.ToString())))
        |> Async.Parallel
    Async.StartAsTask(work, cancellationToken = cancellationToken) 

C# client code:

CancellationTokenSource cancellableWork;

private void Start_Click(object sender, RoutedEventArgs e){
    cancellableWork = new CancellationTokenSource();
    downloader.DownloadAll(cancellableWork.Token);
}
private void Cancel_Click(object sender, RoutedEventArgs e){
    if (cancellableWork != null)
        cancellableWork.Cancel();
}
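
The cancellation checks can also be observed in a small script, independent of the PhotoViewer application. In this sketch countSlowly is a hypothetical workflow; the token is cancelled after 200 ms, and Async.RunSynchronously reports the cancellation as an OperationCanceledException.

```fsharp
open System
open System.Threading

// Hypothetical long-running workflow; every do! and every loop iteration
// is a point where cancellation is checked.
let countSlowly (progress: int ref) = async {
    for i in 1 .. 50 do
        do! Async.Sleep 100
        progress.Value <- i }

let cts = new CancellationTokenSource()
cts.CancelAfter 200     // request cancellation after 200 ms

let progress = ref 0
let wasCancelled =
    try
        Async.RunSynchronously(countSlowly progress, cancellationToken = cts.Token)
        false
    with :? OperationCanceledException -> true

printfn "cancelled = %b after %d iterations" wasCancelled progress.Value
```

No cancellation logic appears inside countSlowly itself; the workflow stops at the next check once the token is cancelled.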
 

Other Great F# Async Samples

Don has recently posted two great articles on some other end-to-end samples built on F# Async.  These posts cover many of the same core F# Async topics, plus some additional topics not touched on here.  The samples are also a lot of fun, using data pulled from Twitter and Bing Translation services.

 

Summary

F#'s immutability, async workflows and agents are useful features for enabling easier parallel and asynchronous programming in application code.  The PDC talk provides a good starting point for understanding how these features can be applied.  The topics above cover a few of the related topics that I've been asked about since then, but there is also much, much more that can be enabled on top of these core features.

Published Monday, February 01, 2010 2:43 AM by LukeH


Comments

# re: F# for Parallel and Asynchronous Programming - PDC 2009

Tuesday, February 02, 2010 6:46 AM by Matt H

One thing that should probably be mentioned along with Exceptions with Async: if you want to throw an exception from within an async { } block, you need to write "return raise", not just "raise". That one tripped me up.

# re: F# for Parallel and Asynchronous Programming - PDC 2009

Thursday, February 04, 2010 8:38 AM by LukeH

Matt - That's right.  As a rule, every branch of an async block must "return" a value.  Raise is not treated specially here, so there still must be a "return", even if it will not be reachable at runtime.
